



































Richard P. Feynman 
Albert R. Hibbs


Emended by Daniel F. Styer





QUANTUM MECHANICS 
AND PATH INTEGRALS 


Emended Edition 


DOVER PUBLICATIONS, INC. 
MINEOLA, NEW YORK 


Copyright 


Copyright © 1965 by Richard P. Feynman and Albert R. Hibbs 
Emended Edition © 2005 by Daniel F. Styer. 
All rights reserved.


Bibliographical Note 


This Dover edition, first published in 2010, is an unabridged, 
emended republication of the work originally published in 1965 by 
McGraw-Hill Companies, Inc., New York. 


Library of Congress Cataloging-in-Publication Data 


Feynman, Richard Phillips. 

Quantum mechanics and path integrals / Richard P. Feynman, 

Albert R. Hibbs, and Daniel F. Styer— Emended ed. 
p. cm.

Originally published: Emended edition. New York : McGraw- 
Hill, 2005. 

Includes bibliographical references and index. 

ISBN-13: 978-0-486-47722-0 

ISBN-10: 0-486-47722-3 

1. Quantum Theory. I. Hibbs, Albert R. II. Styer, Daniel F. III.

Title. 


QC174.12.F484 2010 
530.12--dc22


2010004550 


Manufactured in the United States by Courier Corporation 
47722303 
www.doverpublications.com 


Preface 


The fundamental physical and mathematical concepts which underlie 
the path integral approach to quantum mechanics were first developed 
by R.P. Feynman in the course of his graduate studies at Princeton, 
although more fully developed ideas, such as those described in this 
volume, were not worked out until a few years later. These early in- 
quiries were involved with the problem of the infinite self-energy of the 
electron. In working on that problem, a “least-action” principle using 
half advanced and half retarded potentials was discovered. The princi- 
ple could deal successfully with the infinity arising in the application of 
classical electrodynamics. 
The problem then became one of applying this action principle to
quantum mechanics in such a way that classical mechanics could arise
naturally as a special case of quantum mechanics when ħ was allowed
to go to zero.

Feynman searched for any ideas which might have been previously 
worked out in connecting quantum-mechanical behavior with such clas- 
sical ideas as the lagrangian or, in particular, Hamilton’s principle func- 
tion S, the indefinite integral of the lagrangian. During some conversa- 
tions with a visiting European physicist, Feynman learned of a paper in 
which Dirac had suggested that the exponential function of iε times the
lagrangian was analogous to a transformation function for the
quantum-mechanical wave function in that the wave function at one moment could
be related to the wave function at the next moment (a time interval ε
later) by multiplying with such an exponential function.

The question that then arose was what Dirac had meant by the 
phrase “analogous to,” and Feynman determined to find out whether 
or not it would be possible to substitute the phrase “equal to.” A brief 
analysis showed that indeed this exponential function could be used in 
this manner directly. 

Further analysis then led to the use of the exponent of the time 
integral of the lagrangian, S (in this volume referred to as the action), 
as the transformation function for finite time intervals. However, in the 
application of this function it is necessary to carry out integrals over all 
space variables at every instant of time. 

In preparing an article¹ describing this idea, the idea of “integral
over all paths” was developed as a way of both describing and evalu- 
ating the required integrations over space coordinates. By this time a 
number of mathematical devices had been developed for applying the 
path integral technique and a number of special applications had been 
worked out, although the primary direction of work at this time was 
toward quantum electrodynamics. Actually, the path integral did not 
then provide, nor has it since provided, a truly satisfactory method of 
avoiding the divergence difficulties of quantum electrodynamics, but it 
has been found to be most useful in solving other problems in that field. 
In particular, it provides an expression for quantum-electrodynamic laws 
in a form that makes their relativistic invariance obvious. In addition, 
useful applications to other problems of quantum mechanics have been 
found. 

The most dramatic early application of the path integral method to
an intractable quantum-mechanical problem followed shortly after the
discovery of the Lamb shift and the subsequent theoretical difficulties
in explaining this shift without obviously artificial means of getting rid
of divergent integrals. The path integral approach provided one way of
handling these awkward infinities in a logical and consistent manner.


¹R.P. Feynman, Space-Time Approach to Non-relativistic Quantum Mechanics,
Rev. Mod. Phys., vol. 20, pp. 367-387, 1948.

The path integral approach was used as a technique for teaching 
quantum mechanics for a few years at the California Institute of Tech- 
nology. It was during this period that A.R. Hibbs, a student of Feyn- 
man’s, began to develop a set of notes suitable for converting a lecture 
course on the path integral approach to quantum mechanics into a book 
on the same subject. 

Over the succeeding years, as the book itself was elaborated, other 
subjects were brought into both the lectures of Dr. Feynman and the 
book; examples are statistical mechanics and the variational principle. 
At the same time, Dr. Feynman’s approach to teaching the subject of 
quantum mechanics evolved somewhat away from the initial path in- 
tegral approach. At the present time, it appears that the operator 
technique is both deeper and more powerful for the solution of more 
general quantum-mechanical problems. Nevertheless, the path integral 
approach provides an intuitive appreciation of quantum-mechanical be- 
havior which is extremely valuable in gaining an intuitive appreciation 
of quantum-mechanical laws. For this reason, in those fields of quantum 
mechanics where the path integral approach turns out to be particularly 
useful, most of which are described in this book, the physics student is 
provided with an excellent grasp of basic quantum-mechanical princi- 
ples which will permit him to be more effective in solving problems in 
broader areas of theoretical physics. 


R.P. Feynman 


A.R. Hibbs 


Preface to Emended Edition 


In the forty years since the first publication of Quantum Mechanics and 
Path Integrals, the physics and the mathematics introduced here has 
grown both rich and deep. Nevertheless this founding book — full of 
the verve and insight of Feynman — remains the best source for learning 
about the field. Unfortunately, the 1965 edition was flawed by extensive 
typographical errors as well as numerous infelicities and inconsistencies. 
This edition corrects more than 879 errors, and many more equations 
are recast to make them easier to understand and interpret. Notation is 
made uniform throughout the book, and grammatical errors have been 
corrected. On the other hand, the book is stamped with the rough and 
tumble spirit of a creative mind facing a great challenge. The objective 
throughout has been to retain that spirit by correcting, but not polish- 
ing. This edition does not attempt to add new topics to the book or to 
bring the treatment up to date. However, some comments are added in 
an appendix of notes. (The existence of a relevant comment is signaled 
in the text through the symbol°.) Equation numbers are the same here 
as in the 1965 edition, except that equations (10.63) and (10.64) are 
swapped. 

I thank Edwin Taylor for encouragement and Daniel Keren, Jozef
Hanc, and especially Tim Hatamian for bringing errors to my attention. 
A research status leave from Oberlin College made this project possible. 

I can well remember the day thirty years ago when I opened the 
pages of Feynman-Hibbs, and for the first time saw quantum mechanics 
as a living piece of nature rather than as a flood of arcane algorithms 
that, while lovely and mysterious and satisfying, ultimately defy under- 
standing or intuition. It is my hope and my belief that this emended 
edition will open similar doors for generations to come. 


Daniel F. Styer 


The Fundamental Concepts
of Quantum Mechanics


1-1 PROBABILITY IN QUANTUM MECHANICS¹


From about the beginning of the twentieth century experimental physics 
amassed an impressive array of strange phenomena which demonstrated 
the inadequacy of classical physics. The attempts to discover a theoret- 
ical structure for the new phenomena led at first to a confusion in which 
it appeared that light, and electrons, behaved sometimes like waves and 
sometimes like particles. This apparent inconsistency was completely 
resolved in 1926 and 1927 in the theory called quantum mechanics. The 
new theory asserts that there are experiments for which the exact out- 
come is fundamentally unpredictable and that in these cases one has to 
be satisfied with computing probabilities of various outcomes. But far 
more fundamental was the discovery that in nature the laws of com- 
bining probabilities were not those of the classical probability theory of 
Laplace. The quantum-mechanical laws of the physical world approach 
very closely the laws of Laplace as the size of the objects involved in the 
experiments increases. Therefore, the laws of probabilities which are 
conventionally applied are quite satisfactory in analyzing the behavior 
of the roulette wheel but not the behavior of a single electron or a single 
photon of light. 


A Conceptual Experiment. The concept of probability is not 
altered in quantum mechanics. When we say the probability of a certain 
outcome of an experiment is p, we mean the conventional thing, i.e., that 
if the experiment is repeated many times, one expects that the fraction 
of those which give the outcome in question is roughly p. We shall not be 
at all concerned with analyzing or defining this concept in more detail; 
for no departure from the concept used in classical statistics is required. 

What is changed, and changed radically, is the method of calculating 
probabilities. The effect of this change is greatest when dealing with 
objects of atomic dimensions. For this reason we shall illustrate the 
laws of quantum mechanics by describing the results to be expected in 
some conceptual experiments dealing with a single electron. 

Our imaginary experiment is illustrated in Fig. 1-1. At A we have 
a source of electrons S. The electrons at S all have the same energy 


¹Much of the material appearing in this chapter was originally presented as a
lecture by R.P. Feynman and published as “The Concept of Probability in Quan- 
tum Mechanics” in the Second Berkeley Symposium on Mathematical Statistics and 
Probability, University of California Press, Berkeley, Calif., pp. 533-541, 1951. 


Fig. 1-1 The experimental arrangement. Electrons emitted at A make their way to 
the detector at screen B, but a screen C with two holes is interposed. The detector 
registers a count for each electron which arrives; the fraction which arrives when the 
detector is placed at a distance x from the center of the screen is measured and plotted 
against x, as in Fig. 1-2. 


but come out in all directions to impinge on a screen C. The screen C 
has two holes, 1 and 2, through which the electrons may pass. Finally, 
behind the screen C at plane B we have a detector of electrons which 
may be placed at various distances x from the center of the screen.°

If the detector is extremely sensitive (as a Geiger counter is) it will 
be discovered that the current arriving at x is not continuous, but cor- 
responds to a rain of particles. If the intensity of the source S is very 
low, the detector will record pulses representing the arrival of individual 
particles, separated by gaps in time during which nothing arrives. This 
is the reason we say electrons are particles. If we had detectors simul- 
taneously all over the screen B, with a very weak source S, only one 
detector would respond, then after a little time, another would record 
the arrival of an electron, etc. There would never be a half response 
of the detector; either an entire electron would arrive or nothing would 
happen. And two detectors would never respond simultaneously (except 
for the coincidence that the source emitted two electrons within the re- 
solving time of the detectors — a coincidence whose probability can be 
decreased by further decrease of the source intensity). In other words, 
the detector of Fig. 1-1 records the passage of a single corpuscular entity 
traveling from S to the point x.

This particular experiment has never been done in just this way.°
In the following description we are stating what the results would be 
according to the laws which fit every experiment of this type which has 
ever been performed. Some experiments which directly illustrate the 
conclusions we are reaching here have been done, but such experiments
are usually more complicated. We prefer, for pedagogical reasons, to 
select experiments which are simplest in principle and disregard the 
difficulties of actually doing them. 

Incidentally, if one prefers, one could just as well use light instead 
of electrons in this experiment. The same points would be illustrated. 
The source S could be a source of monochromatic light and the sensitive 
detector a photoelectric cell or, better, a photomultiplier which would 
record pulses, each being the arrival of a single photon. 

What we shall measure for various positions x of the detector is the 
mean number of pulses per second. In other words, we shall determine 
experimentally the (relative) probability P that the electron passes from 
S to x, as a function of x.

The graph of this probability as a function of x is the complicated 
curve illustrated qualitatively in Fig. 1-2a. It has several maxima and 
minima, and there are locations near the center of the screen at which 
electrons hardly ever arrive. It is the problem of physics to discover the 
laws governing the structure of this curve. 

We might suppose (since the electrons behave as particles) that 


I. Each electron which passes from S to x must go through
either hole 1 or hole 2. 




Fig. 1-2 Results of the experiment. Probability of arrival of electrons at x plotted 
against the position x of the detector. The result of the experiment of Fig. 1-1 is plotted 
here at (a). If hole 2 is closed, so the electrons can go through just hole 1, the result 
is (b). For just hole 2 open, the result is (c). If we imagine that each electron goes 
through one hole or the other, we expect the curve (d) = (b) + (c) when both holes are 
open. This is considerably different from what we actually get, (a). 
Fig. 1-3 An analogous experiment in wave interference. The complicated curve P(x)
in Fig. 1-2a is the same as the intensity I(x) of waves which would arrive at x starting
from S and coming through the holes. At some points x the wavelets from holes 1 and
2 interfere destructively (e.g., a crest from hole 1 arrives at the same time as a trough
from hole 2); at others, constructively. This produces the complicated minima and
maxima of the curve I(x).


As a consequence of I we expect that 


II. The chance of arrival at x is the sum of two parts: P₁,
the chance of arrival coming through hole 1, plus P₂, the
chance of arrival coming through hole 2.


We may find out if this is true by direct experiment. Each of the 
component probabilities is easy to determine. We simply close hole 2 
and measure the chance of arrival at x with only hole 1 open. This 
gives the chance P₁ of arrival at x for electrons coming through hole 1.
The result is given in Fig. 1-2b. Similarly, by closing hole 1 we find the
chance P₂ of arrival through hole 2 (Fig. 1-2c).

The sum of these (Fig. 1-2d) clearly is not the same as curve (a).
Hence experiment tells us definitely that P ≠ P₁ + P₂, or that assertion II
is false.


The Probability Amplitude. The chance of arrival at x with 
both holes open is not the sum of the chance with just hole 1 open plus 
the chance with just hole 2 open.

Actually, the complicated curve P(x) is familiar, inasmuch as it is 
exactly the intensity of distribution in the interference pattern to be 
expected if waves starting from S pass through the two holes and im- 
pinge on the screen B (Fig. 1-3). The easiest way to represent wave 
amplitudes is by complex numbers. We can state the correct law for 
P(x) mathematically by saying that P(x) is the absolute square of a 
certain complex quantity (if electron spin is taken into account, it is a
hypercomplex quantity) φ(x) which we call the probability amplitude of
arrival at x. Furthermore, φ(x) is the sum of two contributions: φ₁(x),
the amplitude for arrival at x through hole 1, plus φ₂(x), the amplitude
for arrival at x through hole 2. In other words,


III. There are complex numbers φ₁ and φ₂ such that

$$P = |\phi|^2 \tag{1.1}$$

$$\phi = \phi_1 + \phi_2 \tag{1.2}$$

and

$$P_1 = |\phi_1|^2 \qquad P_2 = |\phi_2|^2 \tag{1.3}$$


In later chapters we shall discuss in detail the actual calculation of φ₁
and φ₂. Here we say only that φ₁, for example, may be calculated as
a solution of a wave equation representing waves spreading from the
source to hole 1 and from hole 1 to x. This reflects the wave properties
of electrons (or in the case of light, photons).

To summarize: We compute the intensity (i.e., the absolute square 
of the amplitude) of waves which would arrive in the apparatus at x and 
then interpret this intensity as the probability that a particle will arrive 
at x. 
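
As a rough numerical illustration of law III, the sketch below models each amplitude as a simple outgoing wave exp(ikr)/r from a hole to the point x. This wave model, the function name `two_slit`, the placement of the holes at ±a/2, and every numerical value are assumptions made only for illustration; the actual calculation of the amplitudes is left to later chapters.

```python
import cmath
import math

def two_slit(x, wavelength=1.0, a=5.0, l=100.0):
    """Return (P, P1, P2) at detector position x for a toy two-hole model.

    Assumption: each amplitude is an outgoing wave exp(i k r) / r, where r is
    the distance from the hole (at height +a/2 or -a/2) to the point x on B.
    """
    k = 2.0 * math.pi / wavelength
    r1 = math.hypot(l, x - a / 2.0)   # path length from hole 1
    r2 = math.hypot(l, x + a / 2.0)   # path length from hole 2
    phi1 = cmath.exp(1j * k * r1) / r1
    phi2 = cmath.exp(1j * k * r2) / r2
    P1 = abs(phi1) ** 2               # Eq. (1.3), only hole 1 open
    P2 = abs(phi2) ** 2               # Eq. (1.3), only hole 2 open
    P = abs(phi1 + phi2) ** 2         # Eqs. (1.1) and (1.2), both holes open
    return P, P1, P2

# Compare curve (a), P, with curve (d), P1 + P2, at a few detector positions.
for x in (0.0, 5.0, 10.0, 15.0, 20.0):
    P, P1, P2 = two_slit(x)
    print(f"x = {x:5.1f}   P = {P:.3e}   P1 + P2 = {P1 + P2:.3e}")
```

In this toy model the two columns differ at almost every x, which is the numerical counterpart of the statement that assertion II fails.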


Logical Difficulties. What is remarkable is that this dual use of 
wave and particle ideas does not lead to contradictions. This is so only 
if great care is taken as to what kind of statements one is permitted to 
make about the experimental situation. 

To discuss this point in more detail, consider first the situation which 
arises from the observation that our new law III of composition of probabilities
implies, in general, that it is not true that P = P₁ + P₂. We
must conclude that when both holes are open, it is not true that the 
particle goes through one hole or the other. For if it had to go through 
one or the other, we could classify all the arrivals at x into two disjoint 
classes, namely, those arriving through hole 1 and those arriving through 
hole 2; and the frequency P of arrival at x would surely be the sum of 
the frequency P₁ of particles coming through hole 1 and the frequency
P₂ of those coming through hole 2.

To extricate ourselves from the logical difficulties introduced by this 
startling conclusion, we might try various artifices. We might say, for 
example, that perhaps the electron travels in a complex trajectory go- 
ing through hole 1, then back through hole 2 and finally out through 
hole 1 in some complicated manner. Or perhaps the electron spreads
out somehow and passes partly through both holes so as to eventually
produce the interference result III. Or perhaps the chance P₁ that the
electron passes through hole 1 has not been determined correctly inasmuch
as closing hole 2 might have influenced the motion near hole 1.
Many such classical mechanisms have been tried to explain the result.
When light photons are used (in which case the same law III applies),
the two interfering paths 1 and 2 can be made to be many centimeters 
apart (in space), so that the two alternative trajectories must almost 
certainly be independent. That the actual situation is more profound 
than might at first be supposed is shown by the following experiment. 


The Effect of Observation. We have concluded on logical grounds 
that since P ≠ P₁ + P₂, it is not true that the electron passes through
either hole 1 or hole 2. But it is easy to design an experiment to test 
our conclusion directly. We have merely to have a source of light behind 
the holes and watch to see through which hole the electron passes (see 
Fig. 1-4). For electrons scatter light, so that if light is scattered behind 
hole 1, we may conclude that an electron passed through hole 1; and if 
it is scattered behind hole 2, the electron has passed through hole 2.

The result of this experiment is to show unequivocally that the elec- 
tron does pass through either hole 1 or hole 2! That is, for every electron 
which arrives at the screen B (assuming the light is strong enough that 
we do not miss seeing it) light is scattered either behind hole 1 or behind 
hole 2, and never (if the source S is very weak) at both places. (A more 
delicate experiment could even show that the charge passing through 
the holes passes through either one or the other and is in all cases the 
complete charge of one electron and not a fraction of it.) 


Fig. 1-4 A modification of the experiment of Fig. 1-1. Here we place a lamp L behind the
screen C and look for light scattered by the electrons passing through hole 1 or hole 2. With a
strong lamp every electron is indeed found to pass by one or the other hole. But now the
probability of arrival at x is no longer given by the curve of Fig. 1-2a, but is instead given by
Fig. 1-2d.

It now appears that we have come to a paradox. For suppose that
we combine the two experiments. We watch to see through which hole
the electron passes and at the same time measure the chance that the
electron arrives at x. Then for each electron which arrives at x we can
say experimentally whether it came through hole 1 or hole 2. First
we may verify that P₁ is given by the curve in Fig. 1-2b, because if
we select, of the electrons which arrive at x, only those which appear
to come through hole 1 (by scattering light there), we find they are,
indeed, distributed very nearly as in curve (b). (This result is obtained
whether hole 2 is open or closed, so we have verified that there is no
subtle influence of closing hole 2 on the motion near hole 1.) If we
select the electrons scattering light at hole 2, we get (very nearly) P₂ of
Fig. 1-2c. But now each electron appears at either 1 or 2 and we can
separate our electrons into disjoint classes. So, if we take both together,
we must get the distribution P = P₁ + P₂ illustrated in Fig. 1-2d. And
experimentally we do! Somehow now the distribution does not show the
interference effects III of curve (a)!

What has been changed? When we watch the electrons to see through 
which hole they pass, we obtain the result P = P₁ + P₂. When we do
not watch, we get a different result,


$$P = |\phi_1 + \phi_2|^2 \neq P_1 + P_2$$


Just by watching the electrons, we have changed the chance that 
they arrive at x. How is this possible? The answer is that, to watch 
them, we used light and the light in collision with the electrons may 
be expected to alter its motion, or, more exactly, to alter its chance of 
arrival at x.

On the other hand, can we not use weaker light and thus expect a 
weaker effect? A negligible disturbance certainly cannot be presumed 
to produce the finite change in distribution from (a) to (d). But weak 
light does not mean a weaker disturbance. Light comes in photons of 
energy hν, where ν is the frequency, or of momentum h/λ, where λ is
the wavelength. Weakening the light just means using fewer photons, so 
that we may miss seeing an electron. But when we do see one, it means 
a complete photon was scattered and a finite momentum of order h/λ
is given to the electron. 

The electrons that we miss seeing are distributed according to the 
interference law (a), while those we do see and which therefore have 
scattered a photon arrive at x with the probability P = P₁ + P₂ in (d).
The net distribution in this case is therefore the weighted mean of (a) and
(d). In strong light, when nearly all electrons scatter light, it is nearly 
(d); and in weak light, when few scatter, it becomes more like (a). 
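
The weighted mean just described can be written compactly. The symbol f is introduced only for this sketch and does not appear in the text: it stands for the fraction of electrons that scatter a photon and are therefore seen.

```latex
P_{\text{net}}(x) = (1 - f)\,\lvert \phi_1(x) + \phi_2(x) \rvert^2
                  + f\,\bigl[P_1(x) + P_2(x)\bigr],
\qquad 0 \le f \le 1 .
```

For f close to 1 (strong light) this tends to curve (d), and for f close to 0 (weak light) it tends to curve (a).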




It might still be suggested that since the momentum carried by the
light is h/λ, weaker effects could be produced by using light of a longer
wavelength λ. But there is a limit to this. If light of too long a wavelength
is used, we shall not be able to tell whether it was scattered from
behind hole 1 or hole 2; for the source of light of wavelength λ cannot
be located in space with precision greater than order λ.

We thus see that any physical agency designed to determine through 
which hole the electron passes must produce, lest we have a paradox, 
enough disturbance to alter the distribution from (a) to (d). 

It was first noticed by Heisenberg, and stated in his uncertainty 
principle, that the consistency of the then-new mechanics required a 
limitation to the subtlety to which experiments could be performed. 
In our case the principle says that an attempt to design apparatus to 
determine through which hole the electron passed, and delicate enough 
so as not to deflect the electron sufficiently to destroy the interference 
pattern, must fail. It is clear that the consistency of quantum mechanics 
requires that it must be a general statement involving all the agencies of 
the physical world which might be used to determine through which hole 
an electron passes. The world cannot be half quantum-mechanical, half 
classical. No exception to the uncertainty principle has been discovered. 


1-2 THE UNCERTAINTY PRINCIPLE


We shall state the uncertainty principle as follows: Any determination 
of the alternative taken by a process capable of following more than one 
alternative destroys the interference between alternatives. Heisenberg’s 
original statement of the uncertainty principle was not given in the form 
we have used here. We shall interrupt our argument for a few paragraphs 
to discuss Heisenberg’s original statement. 

In classical physics a particle can be described as moving along a def- 
inite trajectory and having, for example, a precise position and velocity 
at any particular time. Such a picture would not lead to the odd results 
that we have seen are characteristic of quantum mechanics. Heisenberg’s 
uncertainty principle gives the limits of accuracy of such classical ideas. 
For example, the idea that a particle has both a definite position and a 
definite momentum has its limitations. A real system (i.e., one obeying 
quantum mechanics) looked upon from a classical view appears to be 
one in which the position or momentum is not definite, but is uncertain. 
The uncertainty in position can be reduced by careful measurement, and 
(by applying different techniques) the uncertainty in momentum can be 
reduced by careful measurement. But, as Heisenberg stated in his principle,
both cannot be accurately known simultaneously; the product of
the uncertainties of momentum and position involved in any experiment
cannot be smaller than a number with the order of magnitude of ħ.
(Here ħ = h/2π = 1.055 × 10⁻²⁷ erg-sec, where h is Planck’s constant.)
That such a result is required by physical consistency in the situation
we have been discussing can be shown by considering still another way
of trying to determine through which hole the electron passes.


Example. Notice that if an electron is deflected in passing through 
one of the holes, its vertical component of momentum is changed. Fur- 
thermore, an electron arriving at the detector at x after passing through 
hole 1 is deflected by a different amount, and thus suffers a different 
change in momentum, than an electron arriving at x via hole 2. Sup- 
pose that the screen at C is not rigidly supported, but is free to move 
up and down (Fig. 1-5). Any change in the vertical component of the 
momentum of an electron upon passing through a hole will be accompa- 
nied by an equal and opposite change in the momentum of the screen. 
This change in momentum can be measured by measuring the velocity 
of the screen before and after the passage of an electron. Call δp the difference
in momentum change between electrons passing through hole 1
and hole 2. Then an unambiguous determination of the hole used by a
particular electron requires a momentum determination of the screen
to an accuracy of better than δp.





Fig. 1-5 Another modification of the experiment of Fig. 1-1. The screen C is left free to move 
vertically. If the electron passes hole 2 and arrives at the detector (at x = 0, for example), it is 
deflected upward and the screen C will recoil downward. The hole through which the electron 
passes can be determined for each passage by starting with the screen at rest and measuring 
whether it is recoiling up or down afterward. According to Heisenberg’s uncertainty principle, 
however, such precise momentum measurements on screen C are inconsistent with accurate 
knowledge of its vertical position, so we could not be sure that the center line of the holes is 
correctly set. Instead of P(x) of Fig. 1-2a, we get this smeared a little in the vertical direction, 
so it looks like Fig. 1-2d. 
If the experiment is set up in such a way that the momentum of
screen C can be measured to the required accuracy, then, since we can 
determine the hole passed through, we must find that the resulting dis- 
tribution of electrons is that of curve (d) of Fig. 1-2. The interference 
pattern of curve (a) must be lost. How can this happen? To under- 
stand, note that the construction of a distribution curve in the plane B 
requires an accurate knowledge of the vertical position of the two holes 
in screen C. Thus we must measure not only the momentum of screen 
C but also its position. If the interference pattern of curve (a) is to be 
established, the vertical position of C must be known to an accuracy of 
better than d/2, where d is the spacing between maxima of the curve 
(a). For suppose the vertical position of C is not known to this accuracy; 
then the vertical position of every point in Fig. 1-2a cannot be specified 
with an accuracy greater than d/2 since the zero point of the vertical 
scale must be lined up with some nominal zero point on C. Then the 
value of P at any particular height x must be obtained by averaging over 
all values within a distance d/2 of x. Clearly, the interference pattern 
will be smeared out by this averaging process. The resulting curve will 
look like Fig. 1-2d. 

The interference pattern in the original experiment is the sign of 
wave-like behavior of the electron. The pattern is the same for any wave 
motion, so we may use the well-known result from the theory of light 
interference that the relation between the separation a of the holes, the 
distance l between screen C and plane B, the wavelength λ of the light,
and d is

$$\frac{a}{l} = \frac{\lambda}{d} \tag{1.4}$$

as shown in Fig. 1-6. In Chap. 3 (at Eq. 3.10) we shall find that the wavelength
of the electron wave is intimately connected with the momentum
of the electron by the relation

$$p = \frac{h}{\lambda} \tag{1.5}$$

If p is the total momentum of an electron (and we assume all the electrons
have the same total momentum), then for l ≫ a,

$$\frac{\delta p}{p} \approx \frac{a}{l} \tag{1.6}$$

as shown in Fig. 1-7. It follows that

$$d = \frac{h}{\delta p} \tag{1.7}$$
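
The step to Eq. (1.7) can be spelled out by combining Eqs. (1.4) to (1.6). The chain below is only a restatement of those three relations and holds to the same order-of-magnitude accuracy:

```latex
\delta p \approx p\,\frac{a}{l}
        = \frac{h}{\lambda}\cdot\frac{\lambda}{d}
        = \frac{h}{d}
\quad\Longrightarrow\quad
d \approx \frac{h}{\delta p} .
```
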
Fig. 1-6 Two beams of light, starting in phase at holes 1 and 2, will interfere constructively
when they reach the screen B if they take the same time to travel from C to B. This means
that a maximum in the interference pattern for light beams passing through two holes will
occur at the center of the screen. As we move down the screen, the next maximum will occur
at a distance d, which is far enough from the center that, in traveling to this point, the beam
from hole 1 will have traveled exactly one wavelength λ farther than the beam from hole 2.







Fig. 1-7 The deflection of an electron in passing through a hole in the screen C involves
a change in momentum δp. This change amounts to the addition of a small component of
momentum in a direction approximately perpendicular to the original momentum vector. The
change in energy is completely negligible. For small deflection angles, the total momentum
vector keeps the same magnitude (approximately). Then the deflection angle is represented to a
very good approximation by |δp|/|p|. If two electrons, one starting from hole 1 with momentum
p₁ and the other starting from hole 2 with momentum p₂, reach the same point on the screen B,
then the angles through which they were deflected must differ by approximately a/l. Since we
cannot say through which hole an electron has come, the uncertainty in the vertical component
of momentum which the electron receives on passing through the screen C must be equivalent
to this uncertainty in deflection angle. This gives the relation |p₁ − p₂|/|p| = |δp|/|p| = a/l.

Since experimentally we find that the interference pattern has been lost,
it must be that the uncertainty δx in the measurement of the position
of C is larger than d/2. Thus

$$\delta x \, \delta p > \frac{h}{2} \tag{1.8}$$


which agrees (in order of magnitude) with the usual statement of the 
uncertainty principle.
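
The chain of relations used above (a/l = λ/d, p = h/λ, δp/p = a/l, and δx of about d/2) can be followed with numbers. The sketch below is only an illustration: the momentum, hole separation, and screen distance are hypothetical values chosen for convenience, and `uncertainty_product` is an illustrative name.

```python
h = 6.62607015e-34      # Planck constant in J s (exact SI value)

def uncertainty_product(p, a, l):
    """Return (d, delta_p, delta_x * delta_p) for the two-hole estimate."""
    wavelength = h / p          # p = h / lambda
    d = wavelength * l / a      # a / l = lambda / d
    delta_p = p * a / l         # delta_p / p = a / l
    delta_x = d / 2             # smallest position uncertainty that washes out the fringes
    return d, delta_p, delta_x * delta_p

# Hypothetical numbers, chosen only for illustration.
p = 1.0e-24     # electron momentum, kg m / s
a = 1.0e-6      # separation of the holes, m
l = 1.0         # distance from screen C to plane B, m

d, delta_p, product = uncertainty_product(p, a, l)
print(f"d = {d:.3e} m, delta_p = {delta_p:.3e} kg m/s")
print(f"delta_x * delta_p = {product:.3e} J s, h/2 = {h / 2:.3e} J s")
```

Whatever values are chosen for p, a, and l, they cancel in the product, which is why the bound involves only h.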

A similar analysis can be applied to the previous measuring device 
where the scattering of light was used to determine through which hole 
the electron passed. Such an analysis produces the same lower limit for 
the uncertainties of measurement. 

The uncertainty principle is not “proved” by considering a few such 
experiments. It is only illustrated. The evidence for it is of two kinds. 
First, no one has yet found any experimental way to defeat the limita- 
tions in measurements which it implies. Second, the laws of quantum 
mechanics seem to require it if their consistency is to be maintained, 
and the predictions of these laws have been confirmed again and again 
with great precision.


INTERFERING ALTERNATIVES 


Two Kinds of Alternatives. From a physical standpoint the two 
routes are independent alternatives, yet the implication that the probability
is the sum P₁ + P₂ is false. This means that either the premise or
the reasoning which leads to such a conclusion must be false. Since our 
habits of thought are very strong, many physicists find that it is much 
more convenient to deny the premise than to deny the reasoning. To 
avoid the logical inconsistencies into which it is so easy to stumble, they 
take the following view: When no attempt is made to determine through 
which hole the electron passes, one cannot say it must pass through one 
hole or the other. Only in a situation where apparatus is operating to 
determine through which hole the electron goes is it permissible to say 
that it passes through one or the other. When you watch, you find that 
it goes through either one hole or the other hole; but if you are not 
looking, you cannot say that it goes either one way or the other! Nature 
demands that we walk a logical tightrope if we wish to describe her. 

Contrary to that way of thinking, we shall in this book follow the 
suggestion made in Sec. 1-1 and deny the reasoning; i.e., we shall not 
compute probabilities by adding probabilities for all alternatives. In 
order to make definite the new rules for combining probabilities, it will 
be convenient to define two meanings for the word “alternative.” The
first of these meanings carries with it the concept of exclusion. Thus 
holes 1 and 2 are exclusive alternatives if one of them is closed or if 
some apparatus that can unambiguously determine which hole is used 
is operating. The other meaning of the word “alternative” carries with 
it a concept of combination or interference. (The term interference has 
the same meaning here as it has in optics, i.e., either constructive or
destructive interference.) Thus we shall say that holes 1 and 2 present 
interfering alternatives to the electron when (1) both holes are open and 
(2) no attempt is made to determine through which hole the electron 
passes. When the alternatives are of this interfering type, the laws of 
probability must be changed to the form given in Eqs. (1.1) and (1.2). 

The concept of interfering alternatives is fundamental to all of quan- 
tum mechanics. In some situations we may have both kinds of alter- 
natives present. Suppose we ask, in the two-hole experiment, for the 
probability that the electron arrives at some point, say, within 1 cm of 
the center of the screen. We mean by this the probability that if there 
were counters arranged all over the screen (so one or another would go 
off when the electron arrived), the counter which went off was within 
1 cm of x = 0. Here the various possibilities are that the electron arrives 
at some counter via some hole. The holes represent interfering alternatives,
but the counters represent exclusive alternatives. Thus we first
add φ₁ + φ₂ for a fixed x, take the absolute square of that, and then sum
those resultant probabilities over x from −0.5 to +0.5 cm.
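
The order of operations can be made concrete with a small numerical sketch. Everything specific here is an assumption made for illustration: the form exp(ikr)/r of `hole_amplitude`, the wavelength, and the geometry are hypothetical, and the result is an unnormalized relative probability.

```python
import numpy as np

def hole_amplitude(x, hole_height, k, L):
    """Hypothetical amplitude to reach height x on the screen via a hole at hole_height."""
    r = np.sqrt(L**2 + (x - hole_height) ** 2)
    return np.exp(1j * k * r) / r

k = 2 * np.pi / 0.05                 # wave number for an assumed wavelength of 0.05
L = 100.0                            # distance from the holes to the screen
a = 1.0                              # separation of the two holes
x = np.linspace(-0.5, 0.5, 2001)     # positions of the counters in the region of interest
dx = x[1] - x[0]

phi1 = hole_amplitude(x, +a / 2, k, L)
phi2 = hole_amplitude(x, -a / 2, k, L)

# Holes are interfering alternatives: add amplitudes, then take the absolute square.
density = np.abs(phi1 + phi2) ** 2
# Counters are exclusive alternatives: add the resulting probabilities.
p_region = np.sum(density) * dx

# For comparison only: the result of wrongly treating the holes as exclusive too.
p_no_interference = np.sum(np.abs(phi1) ** 2 + np.abs(phi2) ** 2) * dx

print(f"ratio = {p_region / p_no_interference:.3f}")
```

With this geometry the region lies inside the central maximum, so the interfering sum is nearly twice what adding the probabilities for the two holes would give.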

It is not hard, with a little experience, to tell which kind of alter- 
native is involved. For example, suppose that information about the 
alternatives is available (or could be made available without altering the 
result), but this information is not used. Nevertheless, in this case a 
sum of probabilities (in the ordinary sense) must be carried out over exclusive
alternatives. These exclusive alternatives are those which could
have been separately identified by the information. 


Some Illustrations. When alternatives cannot possibly be resolved 
by any experiment, they always interfere. A striking illustration of this 
is the scattering of two nuclei at 90°, say, in the center-of-gravity system, 
as illustrated in Fig. 1-8. Suppose the nucleus starting at A is an alpha 
particle and the one starting at B is some other nucleus. Ask for the 
probability that the nucleus starting from A is scattered to position 1 and 
that from B to 2. The amplitude is, say, φ(1, 2; A, B). The probability of
this is p = |φ(1, 2; A, B)|². Suppose we do not distinguish what kind of
nucleus arrives at 1, that is, whether it is from A or from B. If it is the
nucleus from B, the amplitude is φ(2, 1; A, B) (which equals φ(1, 2; A, B),
because we have taken a 90° angle). The chance that some nucleus arrives
at 1 and the other at 2 is

$$|\phi(1, 2; A, B)|^2 + |\phi(2, 1; A, B)|^2 = 2p \tag{1.9}$$

Fig. 1-8 Scattering of one nucleus by another in the center-of-gravity
system. The scattering of two identical nuclei shows striking interference
effects. There are two interfering alternatives here. The particle
which arrives at 1, say, could have started either from A or from B. If
the original nuclei were not identical, tests of identity at 1 could determine
which alternative had actually been taken, so they are exclusive
alternatives and the special interference effects do not arise in this case.


We have added the probabilities. The cases “A to 1 and B to 2” and 
“A to 2 and B to 1” are exclusive alternatives because we could, if we 
wished, determine the character of the nucleus at 1 without disturbing 
the previous scattering process.

But what would happen if both A and B released alpha particles? 
Then no experiment can distinguish which is which, and we cannot 
know whether the nucleus arriving at 1 started from A or B. We have 
interfering alternatives, and the probability is 


$$|\phi(1, 2; A, B) + \phi(2, 1; A, B)|^2 = 4p \tag{1.10}$$


This interesting result is readily verified experimentally. 

If electrons scatter electrons, the result is different in two ways. First, 
the electron has a quality we call spin, and a given electron may be 
in one of the two states called spin up and spin down. The spin is 
not changed to first approximation for scattering at low energy. The 
spin carries a magnetic moment. At low velocities the main forces are 
electrical, owing to charge, and the magnetic influences make only a 
small correction, which we neglect. So if the electron from A has spin 
up and the electron from B has spin down, we could later tell which 
arrived at 1 by measuring its spin. If up, it is from A; if down, from B. 
The scattering probability is then 


$$|\phi(1, 2; A, B)|^2 + |\phi(2, 1; A, B)|^2 = 2p \tag{1.11}$$


in this case. 

If, however, electrons at both A and B start with spin up, we cannot
later tell which is which and we would expect 


$$|\phi(1, 2; A, B) + \phi(2, 1; A, B)|^2 = 4p \tag{1.12}$$


Actually this is wrong and, remarkably, electrons obey a different rule. 
The amplitude for an event in which the identity of a pair of electrons 
is reversed contributes 180° out of phase. That is, the case of both spin 
up gives 


$$|\phi(1, 2; A, B) - \phi(2, 1; A, B)|^2 \tag{1.13}$$

In our case of 90° scattering φ(1, 2; A, B) = φ(2, 1; A, B), so this is zero.


Fermions and Bosons. This rule of the 180° phase shift for al- 
ternatives involving exchange in identity of electrons is very odd, and 
its ultimate reason in nature is still only imperfectly understood. Other 
particles besides electrons obey it. Such particles are called fermions, 
and are said to obey Fermi, or antisymmetric, statistics. Electrons, protons,
neutrons, neutrinos, and μ mesons are fermions. So are composites
of an odd number of these such as a nitrogen atom, which contains seven 
electrons, seven protons, and seven neutrons. This 180° rule was first 
stated by Pauli and is the full quantum-mechanical basis of his exclusion 
principle, which controls the character of the chemists’ periodic table. 

Particles for which interchange does not alter the phase are called 
bosons and are said to obey Bose, or symmetrical, statistics. Examples 
of bosons are photons, π mesons, and composites containing an even
number of Fermi particles such as an alpha particle, which is two pro- 
tons and two neutrons. All particles are either one or the other, bosons 
or fermions. These interference properties can have profound and mys- 
terious effects. For example, liquid helium made of atoms of atomic 
mass 4 (bosons) at temperatures of one or two degrees Kelvin can flow 
without any resistance through small tubes, whereas the liquid made of 
atoms of mass 3 (fermions) does not have this property. 
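
The three ways of combining the direct and exchanged amplitudes for 90° scattering can be compared side by side. This is only a minimal sketch: the function name `scattering_probability` and the numerical value chosen for the amplitude are hypothetical.

```python
def scattering_probability(phi_direct, phi_exchange, kind):
    """Combine the direct and exchanged amplitudes according to the kind of particles."""
    if kind == "distinguishable":
        # Exclusive alternatives: add probabilities.
        return abs(phi_direct) ** 2 + abs(phi_exchange) ** 2
    if kind == "boson":
        # Interfering alternatives, exchange contributes in phase.
        return abs(phi_direct + phi_exchange) ** 2
    if kind == "fermion":
        # Interfering alternatives, exchange contributes 180 degrees out of phase.
        return abs(phi_direct - phi_exchange) ** 2
    raise ValueError(f"unknown kind: {kind}")

# At 90 degrees the two amplitudes are equal; the value itself is arbitrary.
phi = 0.3 + 0.4j
p = abs(phi) ** 2

for kind in ("distinguishable", "boson", "fermion"):
    print(kind, scattering_probability(phi, phi, kind) / p)
```

For equal amplitudes the three cases give 2p, 4p, and zero, matching the results quoted for unlike nuclei, alpha particles, and electrons with the same spin.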

The concept of identity of particles is far more complete and definite 
in quantum mechanics than it is in classical mechanics. Classically, two 
particles which seem identical could be nearly identical, or identical for 
all practical purposes, in the sense that they may be so closely equal that 
present experimental techniques cannot detect any difference. However, 
the door is left open for some future technique to establish the differ- 
ence. In quantum mechanics, however, the situation is different. We can 
give a direct test to determine whether or not particles are completely 
indistinguishable. 




If the particles in the experiment diagramed in Fig. 1-8, starting from 
A and B, were only approximately identical, then improvements in exper- 
imental techniques would enable us to determine by close scrutiny of the 
particle arriving at 1, for example, whether it came from A or B. In this 
situation the alternatives of the two initial positions must be exclusive, 
and there must be no interference between the amplitudes describing 
these alternatives. Now the important point is that this act of scrutiny 
would take place after the scattering had taken place. This means that 
the observation could not possibly affect the scattering process, and this 
in turn implies that we would expect no interference between the ampli- 
tudes describing the alternatives (that it is either the particle from A or 
the particle from B which arrives at 1). When interference is nevertheless observed, we must conclude
from the uncertainty principle that there is no way, even in principle, 
to ever distinguish between these possibilities. That is, when a particle 
arrives at 1, it is completely impossible by any test whatsoever, now or 
in the future, to determine whether the particle started from A or B. In 
this more rigorous sense of identity, all electrons are identical, as are all 
protons, etc.

As a second example we consider the scattering of neutrons from 
a crystal. When neutrons of wavelength somewhat shorter than the 
atomic spacing are scattered from the atoms in a crystal, we get very 
strong interference effects. The neutrons emerge only in certain discrete 
directions determined by the Bragg law of reflection, just as for X-rays. 
The interfering alternatives which enter this example are the alternative 
possibilities that it is one, or another, atom which does the scattering of 
a particular neutron. (The amplitude to scatter neutrons from any atom 
is so small that we need not consider alternatives in which a neutron is 
scattered by two or more atoms.) The waves of amplitude describing the 
motion of a neutron which start from these atoms interfere constructively 
only in certain definite directions. 

Now there is an interesting complication which enters this appar- 
ently simple picture. Neutrons, like electrons, carry a spin, which can 
be analyzed in two states, spin up and spin down. Suppose the scat- 
tering material is composed of an atomic species which has a similar 
spin property, such as carbon-13. In this case an experiment will reveal 
two apparently different types of scattering. It is found that besides 
the scattering in discrete directions, as described in the preceding para- 
graph, there is a diffuse scattering in all directions. Why should this 
be? 

A clue to the source of these two types of scattering is provided by 
the following observation. Suppose all the neutrons which enter the ex- 
periment are prepared with spin up. If the spin direction of the emerging 
neutrons is analyzed, it will be found that some are up and some are
down; those which still have spin up are scattered only at the discrete 
Bragg angles, while those whose spin has been changed to down come 
out scattered diffusely in all directions! 

Now in order that a neutron flip its spin from up to down, the law 
of conservation of angular momentum requires that the spin of the scat- 
tering nucleus change from down to up. Therefore, in principle, the 
particular nucleus which was responsible for scattering that particular 
neutron could be determined. We could, in principle, note down before 
the experiment the spin state of all the scattering nuclei in the crystal. 
Then, after the neutron is scattered, we could reinvestigate the crystal 
and see which nucleus had changed its spin from down to up. If no 
crystal nucleus underwent such a change in spin, then neither did the 
neutron, and we cannot tell from which nucleus the neutron actually 
scattered. In this case the alternatives interfere and the Bragg law of 
scattering results. 

If, on the other hand, one crystal nucleus is found to have changed 
spin, then we know that this nucleus did the scattering. There are no 
interfering alternatives. The spherical waves of amplitude which emerge 
from this particular nucleus describe the motion of the scattered neu- 
tron, and only the waves emerging from this nucleus enter into that 
description. In this case there is equal likelihood to find the scattered 
neutron coming out in any direction. 

The concept of searching through all the nuclei in a crystal to find 
which one has changed its spin state is surely a needle-in-the-haystack 
type of activity, but nature is not concerned with the practical difficul- 
ties of experimentation. The important fact is that in principle it is 
possible without producing any disturbance of the scattered neutron to 
determine (in the latter case where the spin states change) which crys- 
tal nucleus actually did the scattering. The existence of this possibility 
means that even if we do not actually carry out this determination, we 
are nevertheless dealing with exclusive (and thus noninterfering)
alternatives.

On the other hand, the fact that we get interference between alter- 
natives in the situation where the spin of the neutron was not changed 
means that it is impossible, even in principle, to ever discover which par- 
ticular crystal nucleus did the scattering — impossible, at least, without 
disturbing the situation during or before the scattering. 
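
The contrast between the two kinds of scattering can be sketched for a one-dimensional row of nuclei. This is an illustration under stated assumptions, not a model of a real crystal: every nucleus is given the same hypothetical unit-strength amplitude exp(iqx), with q the momentum transfer along the row.

```python
import numpy as np

N = 50                          # number of scattering nuclei in the row (hypothetical)
spacing = 1.0                   # lattice spacing, arbitrary units
positions = spacing * np.arange(N)

# The Bragg condition in this toy model is q * spacing = 2 pi * integer.
q = np.linspace(0.0, 4 * np.pi / spacing, 4001)

# Amplitude for scattering from each nucleus, one column per nucleus.
amps = np.exp(1j * np.outer(q, positions))

# No spin flip: the scattering nucleus cannot be identified, so amplitudes add.
coherent = np.abs(amps.sum(axis=1)) ** 2
# Spin flip: the scattering nucleus is marked, so probabilities add.
incoherent = (np.abs(amps) ** 2).sum(axis=1)

print("largest coherent intensity:", coherent.max())
print("incoherent intensity (same in every direction):", incoherent[0])
```

In this sketch the coherent sum is concentrated in sharp peaks of height N² at the Bragg values of q, while the incoherent sum equals N in every direction, which mirrors the discrete and diffuse scattering described above.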


SUMMARY OF PROBABILITY CONCEPTS


Alternatives and the Uncertainty Principle. The purpose of 
this introductory chapter has been to explain the meaning of a probabil- 
ity amplitude and its importance in quantum mechanics and to discuss 
the rules for manipulation of these amplitudes. Thus we have stated 
that there is a quantity called a probability amplitude associated with 
every method whereby an event in nature can take place. For exam- 
ple, an electron going from source S (Fig. 1-1) to a detector at x has 
one amplitude for completing this course while passing through hole 1 
of the screen at C and another amplitude for passing through hole 2. 
Further, we can associate an amplitude with the overall event by adding 
together the amplitudes of each alternative method. Thus, for example, 
the overall amplitude for arrival at x is given in Eq. (1.2) as 


$$\phi = \phi_1 + \phi_2 \tag{1.14}$$


Next, we interpret the absolute square of the overall amplitude as 
the probability that the event will happen. For example, the probability 
that an electron reaches the detector is 


$$P = |\phi_1 + \phi_2|^2 \tag{1.15}$$


If we interrupt the course of the event before its conclusion with an 
observation of the state of the particles involved in the event, we disturb 
the construction of the overall amplitude. Thus if we observe the system 
of particles to be in one particular state, we exclude the possibility that 
it can be in any other state, and the amplitudes associated with the 
excluded states can no longer be added in as alternatives in comput- 
ing the overall amplitude. For example, if we determine with the help 
of some sort of measuring equipment that the electron passes through 
hole 1, the amplitude for arrival at the detector is just φ₁. Further, it
does not matter if we actually observe and record the outcome of the 
measurement or not, so long as the measurement equipment is working. 
Obviously, we could observe the outcome at any time we wished. The 
operation of the measuring equipment is sufficient to disturb the system 
and its probability amplitude. 

This latter fact is the basis of the Heisenberg uncertainty principle, 
which states that there is a natural limit to the subtlety of any experi- 
ment or the refinement of any measurement. 

The Structure of the Amplitude. The amplitude for an event is
the sum of the amplitudes for the various alternative ways that the event 
can occur. This permits the amplitude to be analyzed in many different 
ways depending on the different classes into which the alternatives can 
be divided. The most detailed analysis results from considering that a 
particle going from a to b, for example, in a given time interval, can
be considered to have done this by going in a certain motion (position 
vs. time) or path in space and time. We shall therefore associate an 
amplitude with each possible motion. The total amplitude will be the 
sum of a contribution from each of the paths. 

This idea can be made more clear by a further consideration of our 
experiment with the two holes. Suppose we put a couple of extra screens 
between the source and the holes. Call these screens E and D. In each of 
them we drill a few holes which we call E₁, E₂, ... and D₁, D₂, ... (Fig. 1-9).
For simplicity, we shall assume the electrons are constrained to move
in the xy plane. Then there are several alternative paths which an
electron may take in going from the source to either hole in screen C. It
could go from the source to E₂, and then D₃, and then the hole 1; or it
could go from the source to E₃, then D₁, and finally to the hole 1; etc.
Each of these paths has its own amplitude. The complete amplitude is 
the sum of all of them. 
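
The bookkeeping of this sum over routes can be sketched in a few lines. The text has not yet given the amplitude for a path, so the sketch rests on two labeled assumptions: each straight segment contributes a hypothetical unit phasor exp(ikr), and the amplitude for a whole route is the product of its segment amplitudes. The hole positions are likewise invented for illustration.

```python
import numpy as np
from itertools import product

def segment_amplitude(p_from, p_to, k):
    """Hypothetical amplitude for one straight segment: a unit phasor exp(i k r)."""
    r = np.hypot(p_to[0] - p_from[0], p_to[1] - p_from[1])
    return np.exp(1j * k * r)

k = 2 * np.pi / 0.1                                 # assumed wave number
source = (0.0, 0.0)                                 # points are (distance y, height x)
E_holes = [(1.0, h) for h in (-0.2, 0.0, 0.2)]      # holes in screen E
D_holes = [(2.0, h) for h in (-0.1, 0.1)]           # holes in screen D
hole_1 = (3.0, 0.05)                                # hole 1 in screen C

# One amplitude per route source -> E_i -> D_j -> hole 1; the total is their sum.
total = 0j
for e, d in product(E_holes, D_holes):
    total += (segment_amplitude(source, e, k)
              * segment_amplitude(e, d, k)
              * segment_amplitude(d, hole_1, k))

# The same sum organized screen by screen, as a chain of matrix products.
SE = np.array([segment_amplitude(source, e, k) for e in E_holes])
ED = np.array([[segment_amplitude(e, d, k) for d in D_holes] for e in E_holes])
DC = np.array([segment_amplitude(d, hole_1, k) for d in D_holes])
total_by_screens = SE @ ED @ DC

print(len(E_holes) * len(D_holes), "routes, total amplitude", total)
```

Organizing the sum screen by screen is a convenience of this sketch; adding more holes or more screens only enlarges the arrays, which is the discrete counterpart of the limiting process described next.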







Fig. 1-9 When several holes are drilled in the screens E and D placed between the 
source at screen A and the final position at screen B, several alternative routes are 
available for each electron. For each of these routes there is an amplitude. The result 
of any experiment in which all of the holes are open requires the addition of all these 
amplitudes, one for each possible path. 

Fig. 1-10 More and more holes are cut in the screens at y_E and y_D. Eventually, the
screens are completely riddled with holes, and the electron has a continuous range of
positions, up and down along each screen, at which it can pass through the position
of the screen. In this case the sum of alternatives becomes a double integral over the
continuous variables x_E and x_D describing the alternative heights at which the electron
passes the position of the screens at y_E and y_D.


Next, suppose we continue to drill holes in the screens E and D until 
there is nothing left of the screens. The path of an electron must now be 
specified by the height x_E at which the electron passes the position y_E at
the nonexistent screen E, together with the height x_D at the position y_D,
as in Fig. 1-10. To each pair of heights there corresponds an amplitude.
The principle of superposition still applies, and we must take the sum
(or by now, the integral) of these amplitudes over all possible values of
x_E and x_D.

Clearly, the next thing to do is to place more and more screens 
between the source and hole 1 and in each screen drill so many holes that 
there is nothing left. Throughout this process we continue to refine the 
definition of the path of the electron, until finally we arrive at the sensible 
idea that a path is merely height as a particular function of distance, or 
x(y). We also continue to apply the principle of superposition, until we
arrive at the integral over all paths of the amplitude for each path. 

Now we can make a still finer specification of the motion. Not only 
can we think of the particular path x(y) in space, but we can specify the
time at which it passes each point in space.° That is, a path will (in our
two-dimensional case) be given if the two functions x(t), y(t) are given.
Thus we have the idea of an amplitude to take a certain path x(t), y(t).
The total amplitude to arrive is the sum or integral of this amplitude 
over all possible paths. The problem of defining this concept of a sum 
or integral over all paths in a mathematically more precise way will be 
taken up in Chap. 2.

Chapter 2 also contains the formula for the amplitude for any given 
path. Once this is given, the laws of nonrelativistic quantum mechanics 
are completely stated, and all that remains is a demonstration of the 
application of these laws in a number of interesting special cases. 


SOME REMAINING THOUGHTS 


We shall find that in quantum mechanics, the amplitudes φ are solutions
of a completely deterministic equation (the Schrödinger equation).
Knowledge of φ at t = 0 implies its knowledge at all subsequent times.
The interpretation of |φ|² as the probability of an event is an indeterministic
interpretation. It implies that the result of an experiment is
not exactly predictable. It is very remarkable that this interpretation 
does not lead to any inconsistencies. That this is true has been amply 
demonstrated by analyses of many particular situations by Heisenberg, 
Bohr, Born, von Neumann, and many other physicists. In spite of all 
these analyses the fact that no inconsistency can arise is not thoroughly
obvious. For this reason quantum mechanics appears as a difficult and 
somewhat mysterious subject to a beginner. The mystery gradually decreases
as more examples are tried out, but one never quite loses the
feeling that there is something peculiar about the subject. 

There are a few interpretational problems on which work may still be 
done. They are very difficult to state until they are completely worked 
out. One is to show that the probability interpretation of φ is the only
consistent interpretation of this quantity. We and our measuring in- 
struments are part of nature and so are, in principle, described by an 
amplitude function satisfying a deterministic equation. Why can we
only predict the probability that a given experiment will lead to a def- 
inite result? From what does the uncertainty arise? Almost without 
doubt it arises from the need to amplify the effects of single atomic 
events to such a level that they may be readily observed by large sys- 
tems. The details of this have been analyzed only on the assumption 
that |φ|² is a probability, and the consistency of this assumption has
been shown. It would be an interesting problem to show that no other
consistent interpretation can be made.°

Other problems which may be further analyzed are those dealing 
with the theory of knowledge. For example, there seems to be a lack 
of symmetry in time in our knowledge. Our knowledge of the past is 
qualitatively different from that of the future. In what way is only the 
probability of a future event accessible to us, whereas the certainty of
a past event can often apparently be asserted? These matters again 
have been analyzed to a great extent. Possibly a little more can be 
said to clarify the situation, however. Obviously, we are again involved 
in the consequences of the large size of ourselves and of our measur- 
ing equipment. The usual separation of observer and observed which is 
now needed in analyzing measurements in quantum mechanics should 
not really be necessary, or at least should be even more thoroughly 
analyzed. What seems to be needed is the statistical mechanics of am- 
plifying apparatus.° 

The analyses of such problems are, of course, in the nature of philo- 
sophical questions. They are not necessary for the further development 
of physics. We know we have a consistent interpretation of φ and, almost
without doubt, the only consistent one. The problem of today seems to
be the discovery of the laws governing the behavior of φ for phenomena
involving nuclei and mesons. The interpretation of φ is interesting. But
the much more intriguing question is: What new modifications of our
thinking will be required to permit us to analyze phenomena occurring
within nuclear dimensions?


THE PURPOSE OF THIS BOOK 


So far, we have given the form the quantum-mechanical laws must take, 
i.e., that a probability amplitude exists, and we have outlined one pos- 
sible scheme for calculating this amplitude. There are other ways to 
formulate this. In a more usual approach to quantum mechanics the 
amplitude is calculated by solving a kind of wave equation. For particles 
of low velocity, it is called the Schrödinger equation. A more accurate
equation valid for electrons of velocity arbitrarily close to the velocity of 
light is the Dirac equation. In this case the probability amplitude is a 
kind of hypercomplex number. We shall not discuss the Dirac equation 
in this book, nor shall we investigate the effects of spin. Instead, we limit 
our attention to low-velocity electrons, extending our horizon somewhat 
in the direction of quantum electrodynamics by investigating photons, 
particles whose behavior is determined by Maxwell’s equations. 

In this book we shall give the laws to compute the probability ampli- 
tude for nonrelativistic problems in a manner which is somewhat uncon- 
ventional. In some ways, particularly in developing a conceptual under- 
standing of quantum mechanics, it may be preferred, but in others, e.g., 
in making computations for the simpler problems and for understanding 
the literature, it is disadvantageous.

The more conventional view, via the Schrödinger equation, is already 
presented in many books, but the views to be presented here have ap- 
peared only in abbreviated form in papers in the journals. A central 
aim of this book is to collect this work into one volume where it may be 
expounded with sufficient clarity and detail to be of use to the interested 
student. 

In order to keep the subject within bounds, we shall not make a com- 
plete development of quantum mechanics. Instead, whenever a topic has 
reached such a point that further elucidation would best be made by con- 
ventional arguments appearing in other books, we refer to those books. 
Because of this incompleteness, this book cannot serve as a complete 
textbook of quantum mechanics. It can serve as an introduction to the 
ideas of the subject if used in conjunction with another book that deals 
with the Schrödinger equation, matrix mechanics, and applications of 
quantum mechanics.

On the other hand, we shall use the space saved (by our not develop- 
ing all of quantum mechanics in detail) to consider the application of the 
mathematical methods used in the formulation of quantum mechanics 
to other branches of physics (Chaps. 10-12). 

It is a problem of the future to discover the exact manner of comput- 
ing amplitudes for processes involving the apparently more complicated 
particles, namely, neutrons, protons, and mesons. Of course, one can 
doubt that, when the unknown laws are discovered, we shall find our- 
selves computing amplitudes at all. However, the situation today does 
not seem analogous to that preceding the discovery of quantum mechan- 
ics. 

In the 1920’s there were many indications that the fundamental the- 
orems and concepts of classical mechanics were wrong, i.e., there were 
many paradoxes. General laws could be proved independently of the 
detailed forces involved. Some of these laws did not hold. For exam- 
ple, each spectral line showed a degree of freedom for an atom, and at 
temperature T each degree of freedom should have an energy kT, con- 
tributing R to the specific heat. Yet this very high specific heat expected 
from the enormous number of spectral lines did not appear. 

Today, any general law that we have been able to deduce from the 
principle of superposition of amplitudes, such as the characteristics of 
angular momentum, seems to work. But the detailed interactions still 
elude us. This suggests that amplitudes will exist in a future theory, but 
their method of calculation may be strange to us. 


¹R.P. Feynman, Space-Time Approach to Non-relativistic Quantum Mechanics,
Rev. Mod. Phys., vol. 20, pp. 367-387, 1948. 

In this chapter we intend to complete our specification of nonrelativistic
quantum mechanics which we began in Chap. 1. There we noted the
existence of an amplitude for each trajectory; here we shall give the form
of the amplitude for each trajectory. For a while, for simplicity, we shall
restrict ourselves to the case of a particle moving in one dimension. Thus
the position at any time can be specified by a coordinate x, a function
of time t. By the path, then, we mean a function x(t).

If a particle at an initial time $t_a$ starts from the point $x_a$ and goes to a
final point $x_b$ at time $t_b$, we shall say simply that the particle goes from
a to b and our function $x(t)$ will have the property that $x(t_a) = x_a$ and
$x(t_b) = x_b$. In quantum mechanics, then, we shall have an amplitude,
often called° a kernel, which we may write $K(b, a)$, to get from the point
a to the point b. This will be the sum over all of the trajectories that
go between the end points a and b of a contribution from each. This
is to be contrasted with the situation in classical mechanics in which
there is only one specific and particular trajectory which goes from a to
b, the so-called classical trajectory, which we shall label $\bar{x}(t)$. Before we
go on to give the rule for the quantum-mechanical case, let us remind
ourselves of the situation in classical mechanics.


THE CLASSICAL ACTION 


One of the most elegant ways of expressing the condition that determines
the particular path $\bar{x}(t)$ out of all the possible paths is the principle of
least action. That is, there exists a certain quantity S which can be
computed for each path. The classical path $\bar{x}(t)$ is that for which S is a
minimum. Actually, the real condition is that S be merely an extremum.
That is to say, the value of S is unchanged in the first order if the path
$\bar{x}(t)$ is modified slightly.
The quantity S is given by the expression

$$S = \int_{t_a}^{t_b} L(\dot{x}, x, t)\,dt \tag{2.1}$$

where L is the lagrangian for the system. For a particle of mass m
subject to a potential energy V(x,t), which is a function of position and
time, the lagrangian is

$$L = \frac{m}{2}\dot{x}^2 - V(x,t) \tag{2.2}$$


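The form of the extremum path $\bar{x}(t)$ is determined through the usual
procedures of the calculus of variations. Thus, suppose the path is varied
away from $\bar{x}(t)$ by an amount $\delta x(t)$; the condition that the end points
of $\bar{x}(t)$ are fixed requires

$$\delta x(t_a) = \delta x(t_b) = 0 \tag{2.3}$$

The condition that $\bar{x}(t)$ be an extremum of S means

$$\delta S = S[\bar{x} + \delta x] - S[\bar{x}] = 0 \tag{2.4}$$

to first order in $\delta x(t)$. Using the definition of Eq. (2.1) we may write

$$\begin{aligned}
S[x + \delta x] &= \int_{t_a}^{t_b} L(\dot{x} + \delta\dot{x},\ x + \delta x,\ t)\,dt \\
&= \int_{t_a}^{t_b} \left[ L(\dot{x}, x, t) + \delta\dot{x}\,\frac{\partial L}{\partial \dot{x}} + \delta x\,\frac{\partial L}{\partial x} \right] dt \\
&= S[x] + \int_{t_a}^{t_b} \left[ \delta\dot{x}\,\frac{\partial L}{\partial \dot{x}} + \delta x\,\frac{\partial L}{\partial x} \right] dt
\end{aligned} \tag{2.5}$$

Upon integration by parts, the variation in S becomes

$$\delta S = \left[ \delta x\,\frac{\partial L}{\partial \dot{x}} \right]_{t_a}^{t_b} - \int_{t_a}^{t_b} \delta x \left[ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{x}} \right) - \frac{\partial L}{\partial x} \right] dt \tag{2.6}$$

Since $\delta x(t)$ is zero at the end points, the first term on the right-hand
side of the equation is zero. Between the end points $\delta x(t)$ can take on
any arbitrary value. Thus the extremum is that curve along which the
following condition is always satisfied:

$$\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{x}} \right) - \frac{\partial L}{\partial x} = 0 \tag{2.7}$$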


This is, of course, the classical lagrangian equation of motion. 

In classical mechanics, the form of the action integral $S = \int L\,dt$ is
interesting, not just the extreme value $S_{cl}$. This interest derives from
the necessity to know the action along a set of neighboring paths in
order to determine the path of least action.

In quantum mechanics both the form of the integral and the value of 
the extremum are again important. In the following problems we shall 
evaluate the extremum in a variety of situations. 


Problem 2-1 For a free particle $L = (m/2)\dot{x}^2$. Show that the
action $S_{cl}$ corresponding to the classical motion of a free particle is

$$S_{cl} = \frac{m}{2}\,\frac{(x_b - x_a)^2}{t_b - t_a} \tag{2.8}$$

Problem 2-2° For a harmonic oscillator $L = (m/2)(\dot{x}^2 - \omega^2 x^2)$.
With T equal to $t_b - t_a$, show that the classical action is

$$S_{cl} = \frac{m\omega}{2\sin\omega T}\left[ (x_b^2 + x_a^2)\cos\omega T - 2x_b x_a \right] \tag{2.9}$$


Problem 2-3° Find $S_{cl}$ for a particle under a constant force f, that
is, $L = (m/2)\dot{x}^2 + fx$.
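
The closed forms in these problems can be checked numerically before they are derived. The following is a minimal sketch, not a solution of the problems: it assumes the harmonic-oscillator lagrangian of Problem 2-2, illustrative values $m = \omega = T = 1$, $x_a = 0.5$, $x_b = 1.2$, and a midpoint-rule discretization of Eq. (2.1). The function names and the particular variation $\sin(\pi t/T)$ are choices made here for illustration. The sketch compares the discretized action of the classical path with Eq. (2.9), and then confirms that a variation of size $\alpha$ changes the action only in second order, so that doubling $\alpha$ multiplies the excess action by about 4.

```python
import numpy as np

def discretized_action(x, dt, m, omega):
    """Midpoint-rule estimate of S = integral of (m/2)(xdot^2 - omega^2 x^2) dt."""
    velocity = np.diff(x) / dt              # slope of each short straight segment
    x_mid = 0.5 * (x[1:] + x[:-1])          # position at the middle of each segment
    lagrangian = 0.5 * m * (velocity**2 - omega**2 * x_mid**2)
    return np.sum(lagrangian) * dt

def oscillator_action_formula(xa, xb, T, m, omega):
    """Closed form quoted in Eq. (2.9)."""
    return (m * omega / (2.0 * np.sin(omega * T))
            * ((xa**2 + xb**2) * np.cos(omega * T) - 2.0 * xa * xb))

# Illustrative values (assumptions of this sketch, not taken from the text).
m, omega, T = 1.0, 1.0, 1.0
xa, xb = 0.5, 1.2
n_steps = 20000
t = np.linspace(0.0, T, n_steps + 1)
dt = T / n_steps

# Classical path through the end points, and a variation that vanishes there.
x_classical = (xa * np.sin(omega * (T - t)) + xb * np.sin(omega * t)) / np.sin(omega * T)
variation = np.sin(np.pi * t / T)

S_numeric = discretized_action(x_classical, dt, m, omega)
S_formula = oscillator_action_formula(xa, xb, T, m, omega)

def excess_action(alpha):
    """Action of the varied path minus the action of the classical path."""
    return discretized_action(x_classical + alpha * variation, dt, m, omega) - S_numeric

# If S is stationary, the excess is quadratic in alpha and this ratio is close to 4.
ratio = excess_action(0.02) / excess_action(0.01)
print(S_numeric, S_formula, ratio)
```

A ratio near 2 instead of 4 would signal a first-order change in S, that is, a path which does not satisfy Eq. (2.7).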


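Problem 2-4 Classically, the momentum is defined as

$$p = \frac{\partial L}{\partial \dot{x}} \tag{2.10}$$

Show that the momentum at a final point is

$$\left( \frac{\partial L}{\partial \dot{x}} \right)_{x = x_b} = +\frac{\partial S_{cl}}{\partial x_b} \tag{2.11}$$

while the momentum at an initial point is

$$\left( \frac{\partial L}{\partial \dot{x}} \right)_{x = x_a} = -\frac{\partial S_{cl}}{\partial x_a}$$

Hint: Consider the effect on Eq. (2.6) of a change in the end points.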





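Problem 2-5 Classically, the energy is defined as

$$E = \dot{x}p - L \tag{2.12}$$

Show that the energy at a final point is

$$\dot{x}_b \left( \frac{\partial L}{\partial \dot{x}} \right)_{x = x_b} - L(x_b) = -\frac{\partial S_{cl}}{\partial t_b} \tag{2.13}$$

while the energy at an initial point is

$$+\frac{\partial S_{cl}}{\partial t_a}$$

Hint: A change in the time of an end point requires a change in path,
since all paths must be classical paths.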


THE QUANTUM-MECHANICAL AMPLITUDE 


Now we can give the quantum-mechanical rule. We must say how much
each trajectory contributes to the total amplitude to go from a to b.
It is not just the particular path of extreme action that contributes;
rather, all the paths contribute. They contribute equal magnitudes to
the total amplitude, but contribute at different phases. The phase of the
contribution for a given path is the action S for that path in units of the
quantum of action $\hbar$. That is, to summarize: The probability P(b, a) to
go from a point $x_a$ at time $t_a$ to the point $x_b$ at time $t_b$ is the absolute
square $P(b,a) = |K(b,a)|^2$ of an amplitude K(b,a) to go from a to b.
This amplitude is the sum of contributions $\phi[x(t)]$ from each path.

$$K(b,a) = \sum_{\text{paths from } a \text{ to } b} \phi[x(t)] \tag{2.14}$$

The contribution of a path has a phase proportional to the action S:

$$\phi[x(t)] = \text{const}\; e^{(i/\hbar)S[x(t)]} \tag{2.15}$$


The action is that for the corresponding classical system (see Eq. 2.1). 
The constant will be chosen to normalize K correctly, and it will be taken 
up later (Sec. 2.4) when we discuss more mathematically just what we 
mean in Eq. (2.14) by a sum over paths. 


THE CLASSICAL LIMIT 


Before we go on to making the mathematics more complete, we shall
compare this quantum law with the classical rule. At first sight, from
Eq. (2.15) all paths contribute equally, although their phases vary, so
it is not clear how, in the classical limit, some particular path becomes
most important. The classical approximation, however, corresponds to
the case that the dimensions, masses, times, etc., are so large that S is
enormous in relation to $\hbar$ (= 1.05 × 10⁻²⁷ erg-sec). Then the phase of the
contribution $S/\hbar$ is some very, very large angle. The real (or imaginary)
part of $\phi$ is the cosine (or sine) of this angle. This is as likely to be
plus as minus. Now if we move the path as shown in Fig. 2-1 by a small
amount $\delta x$, small on the classical scale, the change in S is likewise small
on the classical scale, but not when measured in the tiny units of $\hbar$.
These small changes in path will, generally, make enormous changes in
phase, and our cosine or sine will oscillate exceedingly rapidly between
plus and minus values. The total contribution will then add to zero; for
if one path makes a positive contribution, another infinitesimally close
(on a classical scale) makes an equal negative contribution, so that no
net contribution arises.
Fig. 2-1 The classical path 1, $\bar{x}(t)$, is that for which a certain integral, the action S,
is minimum. If the path is varied by $\delta x(t)$, to path 2, the integral suffers no first-order
change. This determines the classical equation of motion.

In quantum mechanics, the amplitude to go from a to b is the sum of amplitudes for
each interfering alternative path. The amplitude for a given path, $e^{iS/\hbar}$, has a phase
proportional to the action.

If the action is very large compared to $\hbar$, neighboring paths such as 3 and 4 have
slightly different actions (slightly different on a classical scale, that is). Such paths will,
because of the smallness of $\hbar$, have very different phases. Their contributions will cancel
out. Only in the vicinity of the classical path $\bar{x}(t)$, where the action changes little
when the path varies, will neighboring paths, such as 1 and 2, contribute in the same
phase and constructively interfere. That is why the approximation of classical physics,
that only the path $\bar{x}(t)$ need be considered, is valid when the action is very large
compared to $\hbar$.


Therefore, no path really needs to be considered if the neighboring
path has a different action; for the paths in the neighborhood cancel
out the contribution. But for the special path $\bar{x}(t)$, for which S is an
extremum, a small change in path produces, in the first order at least,
no change in S. All the contributions from the paths in this region are
nearly in phase, at phase $S_{cl}/\hbar$, and do not cancel out. Therefore, only
for paths in the vicinity of $\bar{x}(t)$ can we get important contributions, and
in the classical limit we need only consider this particular trajectory as
being of importance. In this way the classical laws of motion arise from
the quantum laws.
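
The cancellation argument can be illustrated with a one-parameter family of paths. The sketch below is a toy model under stated assumptions: a free particle with $m = T = 1$, paths of the form $\bar{x}(t) + \alpha\sin(\pi t/T)$, for which the action exceeds $S_{cl}$ by $m\pi^2\alpha^2/4T$, and an illustrative value $\hbar = 0.01$ in the same units. The common factor $e^{iS_{cl}/\hbar}$ has unit modulus and is dropped. The function name and the two windows of $\alpha$ are hypothetical choices. The sketch adds up the phasors over a window of paths near the classical one and over an equally wide window far from it.

```python
import numpy as np

def window_sum(alpha_lo, alpha_hi, hbar, m=1.0, T=1.0, n=200001):
    """Trapezoid-rule integral of exp(i * excess action / hbar) over alpha."""
    alpha = np.linspace(alpha_lo, alpha_hi, n)
    # Excess action of the free-particle path x_bar(t) + alpha * sin(pi t / T).
    excess_action = m * np.pi**2 * alpha**2 / (4.0 * T)
    phasor = np.exp(1j * excess_action / hbar)
    d_alpha = alpha[1] - alpha[0]
    return np.sum(0.5 * (phasor[1:] + phasor[:-1])) * d_alpha

hbar = 0.01                                  # small compared with the actions involved
near = abs(window_sum(-0.5, 0.5, hbar))      # paths close to the classical path
far = abs(window_sum(2.0, 3.0, hbar))        # equally many paths far from it
print(near, far)
```

With these values the window around $\alpha = 0$ contributes far more than the distant window, although both contain the same number of paths of equal magnitude. Making $\hbar$ smaller sharpens the contrast.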

We may note that trajectories which differ from $\bar{x}(t)$ contribute as
long as the action is still within about $\hbar$ of $S_{cl}$. The classical trajectory
is indefinite to this slight extent, and this rule serves as a measure of
the limitations of the precision of the classically defined trajectory.

Next consider the dependence of the phase on the position of the end
point $(x_b, t_b)$. If we change the end point a little, this phase changes a
great deal, and K(b,a) changes very rapidly. If by a “smooth function”
we mean one like $S_{cl}$ which changes only when changes in argument
which are appreciable on a classical scale are made, we note that K(b,a)
is far from smooth, but in this classical approximation our arguments
show that it is of the form

$$K(b,a) = \text{“smooth function”} \cdot e^{(i/\hbar)S_{cl}} \tag{2.16}$$

All these approximate considerations apply to a situation on a scale
for which classical physics might be expected to work ($S \gg \hbar$). But at
an atomic level, S may be comparable with $\hbar$, and then all trajectories
must be added in Eq. (2.14) in detail. No particular trajectory is of
overwhelming importance, and of course Eq. (2.16) is not necessarily a
good approximation. To deal with such cases, we shall have to find out
how to carry out such sums as are implied by Eq. (2.14).


THE SUM OVER PATHS 


Analogy with the Riemann Integral. Although the qualitative 
idea of a sum of a contribution for each of the paths is clear, a more pre- 
cise mathematical definition of such a sum must be given. The number 
of paths is a high order of infinity, and it is not evident what measure 
is to be given to the space of paths. It is our purpose in this section to 
give such a mathematical definition. This definition will be found rather 
cumbersome for actual calculation. In the succeeding chapters we shall 
describe other and more efficient methods of computing the sum over all 
paths. As for this section, it is hoped that the mathematical difficulty, or 
rather inelegance, will not distract the reader from the physical content 
of the ideas.

We can begin our understanding with a consideration of the ordinary
Riemann integral. We could say, very roughly, that the area A under
a curve is the sum of all its ordinates. Better, we could say that it is
proportional to that sum. But to make the idea precise, we do this: take
a subset of all ordinates (e.g., those spaced at equal intervals h). Adding
these ordinates, we obtain

$$A \sim \sum_i f(x_i) \tag{2.17}$$

where the summation is carried out over the finite set of points $x_i$, as
shown in Fig. 2-2.
Fig. 2-2 In the definition of the ordinary Riemann integral, a set of ordinates is
drawn from the abscissa (the x-axis) to the curve. The ordinates are spaced a distance
h apart. The integral (area between the curve and the abscissa) is approximated by h
times the sum of the ordinates. This approximation approaches the correct value as h
approaches zero.

An analogous definition can be used for path integrals. The measure which goes to
zero in the limit process is the time interval $\epsilon$ between discrete points on the paths.


The next step is to define A as the limit of this sum as the subset of 
points, and thus the subset of ordinates, becomes more complete or — 
because a finite set is never any measurable part of the infinite continuum 
— we may better say as the subset becomes more representative of the 
complete set. We can pass to the limit in an orderly manner by taking 
continually smaller and smaller values of h. In so doing, we would obtain 
a different sum for each value of h. No limit exists. In order to obtain 
a limit to this process, we must specify some normalizing factor which 
should depend on h. Of course, for the Riemann integral, this factor is
just h itself. Now the limit exists and we may write the expression

$$A = \lim_{h \to 0} \left[ h \sum_i f(x_i) \right] \tag{2.18}$$
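
The role of the normalizing factor can be seen in a few lines. The sketch below is only an illustration: the integrand $\sin x$ on $[0, \pi]$, whose area is 2, and the function name are choices made here. It prints the bare sum of ordinates of Eq. (2.17) next to the normalized sum of Eq. (2.18) for successively finer spacings.

```python
import math

def ordinate_sum(f, a, b, n):
    """Return the spacing h = (b - a)/n and the sum of n ordinates of f starting at a."""
    h = (b - a) / n
    return h, sum(f(a + i * h) for i in range(n))

results = {}
for n in (10, 100, 1000):
    h, bare = ordinate_sum(math.sin, 0.0, math.pi, n)
    results[n] = (bare, h * bare)            # (bare sum, normalized sum)
    print(n, bare, h * bare)
```

The bare sum grows roughly in proportion to the number of ordinates and has no limit, while h times the sum settles toward the area.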


Constructing the Sum. We can follow through an analogous procedure
in defining the sum over all paths. First, we choose a subset of
all paths. To do this, we divide the independent variable time into steps
of width $\epsilon$. This gives us a set of values $t_i$ spaced an interval $\epsilon$ apart
between the values $t_a$ and $t_b$. At each time $t_i$ we select some special point
$x_i$. We construct a path by connecting all the points so selected with
straight lines. It is possible to define a sum over all paths constructed
in this manner by taking a multiple integral over all values of $x_i$ for i
between 1 and N − 1, where

$$\begin{aligned}
N\epsilon &= t_b - t_a \\
\epsilon &= t_{i+1} - t_i \\
t_0 &= t_a \qquad t_N = t_b \\
x_0 &= x_a \qquad x_N = x_b
\end{aligned} \tag{2.19}$$

The resulting equation is

$$K(b,a) \sim \int\!\!\int\!\cdots\!\int \phi[x(t)]\; dx_1\, dx_2 \cdots dx_{N-1} \tag{2.20}$$

We do not integrate over $x_0$ or $x_N$ because these are the fixed end
points $x_a$ and $x_b$. This equation corresponds formally to Eq. (2.17).
In the present case we can obtain a more representative sample of the
complete set of all possible paths between a and b by making $\epsilon$ smaller.
However, just as in the case of the Riemann integral, we cannot proceed
to the limit of this process because the limit does not exist. Once again
we must provide some normalizing factor which we expect will depend
upon $\epsilon$.

Unfortunately, to define such a normalizing factor seems to be a very
difficult problem and we do not know how to do it in general. But we do
know how to give the definition for all situations which so far seem to
have practical value. For example, take the case where the lagrangian is
given by Eq. (2.2). The normalizing factor turns out to be $A^{-N}$, where

$$A = \left( \frac{2\pi i\hbar\epsilon}{m} \right)^{1/2} \tag{2.21}$$

We shall see later (in Sec. 4-1) how this result is obtained. With this
factor the limit exists° and we may write

$$K(b,a) = \lim_{\epsilon \to 0} \frac{1}{A} \int\!\!\int\!\cdots\!\int e^{(i/\hbar)S[b,a]}\; \frac{dx_1}{A}\, \frac{dx_2}{A} \cdots \frac{dx_{N-1}}{A} \tag{2.22}$$

where

$$S[b,a] = \int_{t_a}^{t_b} L(\dot{x}, x, t)\,dt \tag{2.23}$$

is a line integral taken over the trajectory passing through the points $x_i$
with straight sections between, as in Fig. 2-3.


Fig. 2-3 The sum over paths is defined as a limit, in which at first the path is
specified by giving only its coordinate x at a large number of specified times separated by
very small time intervals $\epsilon$. The path sum is then an integral over all these specific
coordinates. Then to achieve the correct measure, the limit is taken as $\epsilon$ approaches 0.

It is possible to define the path in a somewhat more elegant manner.
Instead of straight lines between the points i and i + 1, we could use
sections of the classical orbit. Then we could say that S is the minimum
value of the integral of the lagrangian over all the paths which go through
the specified points $(x_i, t_i)$. With this definition no recourse is made to
arbitrary straight lines.


The Path Integral. There are many ways to define a subset of 
all the paths between a and b. The particular definition we have used 
here may not be the best for some mathematical purposes. For example, 
suppose the lagrangian depends upon the acceleration of x. In the way 
we have constructed the path, the velocity is discontinuous at the various
points $(x_i, t_i)$; that is, the acceleration is infinite at these points. It is
possible that this situation would lead to trouble. However, in the few
such examples with which we have had experience the substitution

$$\ddot{x} = \frac{1}{\epsilon^2}\,(x_{i+1} - 2x_i + x_{i-1}) \tag{2.24}$$



has been adequate. There may be other cases where no such substitution 
is available or adequate, and the present definition of a sum over all paths 
is just too awkward to use. 

A similar situation arises in ordinary integration, where sometimes 
the Riemann definition, Eq. (2.18), is not adequate and recourse must 
be had to some other definition, such as that of Lebesgue. The need 
to redefine the method of integration does not destroy the concept of 
integration. So we feel that the possible awkwardness of the special 
definition of the sum over all paths (as given in Eq. 2.22) may eventu- 
ally require new definitions to be formulated. Nevertheless, the concept 
of the sum over all paths, like the concept of an ordinary integral, is 
independent of a special definition and valid in spite of the failure of
such definitions. Thus we shall write the sum over all paths in a less
restrictive notation as

$$K(b,a) = \int_a^b e^{(i/\hbar)S[b,a]}\; \mathcal{D}x(t) \tag{2.25}$$

which we shall call a path integral. The identifying notation in this
expression is the script $\mathcal{D}$. Only rarely shall we return to the form given
in Eq. (2.22).


Problem 2-6 The class of functionals for which path integrals can 
be defined is surprisingly varied. So far we have considered functionals 
such as that given in Eq. (2.15). Here we shall consider quite a different 
type. This latter type of functional arises in a one-dimensional relativis- 
tic problem. Suppose a particle moving in one dimension can go only 
forward or backward at the velocity of light. For convenience, we shall 
define the units such that the velocity of light, the mass of the particle, 
and Planck’s constant are all unity. Then in the xt plane all trajectories 
shuttle back and forth with slopes of ±45°, as in Fig. 2-4. The amplitude
for such a path can be defined as follows: Suppose time is divided into
small equal steps of length $\epsilon$. Suppose reversals of path direction can
occur only at the boundaries of these steps, i.e., at $t = t_a + n\epsilon$, where
n is an integer. For this relativistic problem the amplitude to go along
such a path is different from the amplitude defined in Eq. (2.15). The
correct definition for the present case is

$$\phi = (i\epsilon)^R \tag{2.26}$$


where R is the number of reversals, or corners, along the path. 


Fig. 2-4 The path of a relativistic particle traveling in one dimension is a zigzag of
straight segments. The slope of the segments is constant in magnitude and differs only in
sign from zig to zag. The amplitude for a particular path, as well as the kernel to go
from a to b, depends on the number of corners R along a path, as shown by Eqs. (2.26)
and (2.27).

As a problem, the reader may use this definition to calculate the
kernel K(b,a) by adding together the contribution for the paths of one
corner, two corners, etc. Thus determine

$$K(b,a) = \sum_R N(R)\,(i\epsilon)^R \tag{2.27}$$

where N(R) is the number of paths possible with R corners. It is best to
calculate four separate K’s, namely, the amplitude $K_{++}(b,a)$ of starting
at the point a with a positive velocity and coming into the point b with a
positive velocity, the amplitude $K_{+-}(b,a)$ of starting at the point a with
a negative velocity and coming into the point b with a positive velocity,
and the amplitudes $K_{-+}(b,a)$ and $K_{--}(b,a)$ defined in a similar fashion.

Next suppose the unit of time is defined as $\hbar/mc^2$. If the time
interval is very long [$t_b - t_a \gg \hbar/mc^2$] and the average velocity is small
[$x_b - x_a \ll c(t_b - t_a)$], show that the resulting kernel is approximately the
same as that for a nonrelativistic free particle (given in Eq. 3.3), except
for a factor $\exp\{-(i/\hbar)mc^2(t_b - t_a)\}$. The definition given here for the
amplitude, and the resulting kernel, is correct for a relativistic theory of
a free particle moving in one dimension. The result is equivalent to the
Dirac equation for that case.
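
For a small number of time steps the counting in Eq. (2.27) can be done by brute force, which gives something to compare a closed-form answer against. The sketch below is only that, and it rests on assumptions made here: each path is a sequence of N steps of +1 or −1 (velocity ±c in the units of the problem), the displacement $x_b - x_a$ is given in units of $\epsilon$, and the hypothetical helpers `corner_counts` and `kernels` index their results by (final sign, initial sign) to match the subscript order of $K_{+-}$ in the text. The enumeration visits all $2^N$ sequences, so it is practical only for small N.

```python
from collections import Counter
from itertools import product

def corner_counts(n_steps, displacement):
    """Count zigzag paths by number of corners R, keyed by (final sign, initial sign)."""
    counts = {}
    for steps in product((+1, -1), repeat=n_steps):
        if sum(steps) != displacement:       # path must end at x_b
            continue
        corners = sum(1 for s, s_next in zip(steps, steps[1:]) if s != s_next)
        key = (steps[-1], steps[0])
        counts.setdefault(key, Counter())[corners] += 1
    return counts

def kernels(n_steps, displacement, eps):
    """Sum N(R) (i eps)^R of Eq. (2.27) separately for each pair of end velocities."""
    return {key: sum(n * (1j * eps) ** r for r, n in by_r.items())
            for key, by_r in corner_counts(n_steps, displacement).items()}

counts = corner_counts(4, 0)
K = kernels(4, 0, 0.1)
print(dict(counts[(+1, +1)]), K[(+1, +1)])   # the single path + - - + has two corners
print(dict(counts[(+1, -1)]), K[(+1, -1)])   # paths - + - + and - - + +
```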


EVENTS OCCURRING IN SUCCESSION 


The Rule for Two Events. In this section we shall derive an 
important law for the composition of amplitudes for events which occur 
successively in time. Suppose $t_c$ is some time between $t_a$ and $t_b$. Then
the action along any path between a and b can be written as

$$S[b,a] = S[b,c] + S[c,a] \tag{2.28}$$


This follows from the definition of the action as an integral in time and 
also from the fact that L does not depend on derivatives higher than 
the velocity. (Otherwise, we would have to specify values of velocity and 
perhaps higher derivatives at point c.) Using Eq. (2.25) to define the 
kernel, we can write 


$$K(b,a) = \int_a^b e^{(i/\hbar)S[b,c] + (i/\hbar)S[c,a]}\; \mathcal{D}x(t) \tag{2.29}$$

Fig. 2-5 One way the sum over all paths can be taken is by first summing over paths
which go through the point at $x_c$ and time $t_c$ and later summing over the points $x_c$. The
amplitude on each path that goes from a to b via c is a product of two factors: (1) an
amplitude to go from a to c and (2) an amplitude to go from c to b. This is therefore
valid also for the sum over all paths through c: the total amplitude to go from a to b via c
is K(b,c)K(c,a). Thus summing over the alternatives (values of $x_c$), we get for the total
amplitude to go from a to b, Eq. (2.31).

It is possible to split any path into two parts. The first part would
have the end points $x_a$ and $x_c = x(t_c)$, and the second part would have
the end points $x_c$ and $x_b$, as shown in Fig. 2-5. It is possible to integrate
over all paths from a to c, then over all paths from c to b, and finally
integrate the result over all possible values of $x_c$. In performing the first
step of this integration S[b,c] is constant. Thus the result can be written
as

$$K(b,a) = \int_{x_c} \int_c^b e^{(i/\hbar)S[b,c]}\, K(c,a)\; \mathcal{D}x(t)\; dx_c \tag{2.30}$$


where integrations must now be carried out not only over paths between
c and b but also over the variable end point $x_c$. In the next step we carry
out the integration over all paths between some point with an arbitrary
$x_c$ and the point b. All that is left is an integral over all possible values
of $x_c$. Thus

$$K(b,a) = \int_{x_c} K(b,c)\,K(c,a)\; dx_c \tag{2.31}$$

Perhaps the argument is clearer starting from Eq. (2.22). Select one
of the discrete times as $t_c$. Thus let $t_c = t_k$ and $x_c = x_k$. First carry out
all the integrations over those $x_i$ such that i < k. This will introduce
the factor K(c,a) in the integral. Next carry out the integrals over all
those $x_i$ such that i > k. This introduces the factor K(b,c). All that is
left is an integral over $x_c$. The result can be written as Eq. (2.31).

This result can be summarized in the following way. All alternative
paths from a to b can be labeled by specifying the position $x_c$ through
which they pass at time $t_c$. Then the kernel for a particle going from a
to b can be computed from the rules:


1. The kernel to go from a to b is the sum, over all possible
values of $x_c$, of amplitudes for the particle to go from a
to c and then to b.
2. The amplitude to go from a to c and then to b is the kernel
to go from a to c times the kernel to go from c to b.


Thus we have the rule: Amplitudes for events occurring in succession 
in time multiply. 


Extension to Several Events. There are many applications for 
the important rule, and several will be developed in succeeding chapters. 
Here we shall show the application wherein we follow an alternative route 
in deriving the equation for the kernel, Eq. (2.22). 

It is perfectly possible to make two divisions in all the paths: one at
$t_c$ and the other at, say, $t_d$. Then the kernel for a particle going from a
to b can be written

$$K(b,a) = \int_{x_c} \int_{x_d} K(b,c)\,K(c,d)\,K(d,a)\; dx_d\, dx_c \tag{2.32}$$

This means that we look at a particle which goes from a to b as if it
went first from a to d, then from d to c, and finally from c to b. The
amplitude to follow such a path is the product of the kernels for each
part of the path. The kernel taken over all such paths that go from a to
b is obtained by integrating this product over all possible values of $x_d$
and $x_c$.

We can continue this process until we have the time scale divided
into N intervals. The result is

$$K(b,a) = \int_{x_1} \int_{x_2} \cdots \int_{x_{N-1}} K(b, N-1)\,K(N-1, N-2) \cdots K(i+1, i) \cdots K(1, a)\; dx_1\, dx_2 \cdots dx_{N-1} \tag{2.33}$$


This means that we can define the kernel in a manner different from that
given in Eq. (2.22). In this alternative definition the kernel for a particle
to go between two points separated by an infinitesimal time interval $\epsilon$ is

$$K(i+1, i) = \frac{1}{A} \exp\left[ \frac{i\epsilon}{\hbar}\, L\!\left( \frac{x_{i+1} - x_i}{\epsilon},\ \frac{x_{i+1} + x_i}{2},\ \frac{t_{i+1} + t_i}{2} \right) \right] \tag{2.34}$$

which is correct to first order in $\epsilon$. Then by the rules for multiplying the
amplitudes of events which occur successively in time, we have

$$\phi[x(t)] = \lim_{\epsilon \to 0} \prod_{i=0}^{N-1} K(i+1, i) \tag{2.35}$$

for the amplitude of a complete path. Then, using the rule that
amplitudes for alternative paths add, we arrive at a definition for K(b,a).
It can be seen that the resulting expression is actually the same as
Eq. (2.22).
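
The statement that the product of short-time kernels reproduces the integrand of Eq. (2.22) can be checked directly for a finite number of steps. The sketch below assumes a lagrangian of the form of Eq. (2.2); the harmonic potential, the sample points, and all function names are illustrative choices made here. It multiplies the factors of Eq. (2.34) along one discrete path and compares the product with $A^{-N} e^{(i/\hbar)S}$, where S is the sum of $\epsilon L$ over the segments.

```python
import numpy as np

def normalizing_constant(eps, m, hbar):
    """The factor A of Eq. (2.21)."""
    return np.sqrt(2j * np.pi * hbar * eps / m)

def segment_lagrangian(x_next, x_prev, t_mid, eps, m, V):
    """L evaluated as in Eq. (2.34): mean velocity, midpoint position, midpoint time."""
    velocity = (x_next - x_prev) / eps
    return 0.5 * m * velocity**2 - V(0.5 * (x_next + x_prev), t_mid)

def path_amplitude(xs, t_a, eps, m, hbar, V):
    """Product of the short-time kernels K(i+1, i) along the discrete path xs."""
    A = normalizing_constant(eps, m, hbar)
    amplitude = 1.0 + 0.0j
    for i in range(len(xs) - 1):
        t_mid = t_a + (i + 0.5) * eps
        L = segment_lagrangian(xs[i + 1], xs[i], t_mid, eps, m, V)
        amplitude *= np.exp(1j * eps * L / hbar) / A
    return amplitude

def path_action(xs, t_a, eps, m, V):
    """Discretized action: sum of eps * L over the segments."""
    return sum(eps * segment_lagrangian(xs[i + 1], xs[i], t_a + (i + 0.5) * eps, eps, m, V)
               for i in range(len(xs) - 1))

# Illustrative inputs (assumptions of this sketch).
m, hbar, eps, t_a = 1.0, 1.0, 0.1, 0.0
V = lambda x, t: 0.5 * x**2
xs = [0.0, 0.3, 0.1, 0.4]                    # x_0 = x_a, ..., x_N = x_b with N = 3
N = len(xs) - 1

amplitude = path_amplitude(xs, t_a, eps, m, hbar, V)
expected = np.exp(1j * path_action(xs, t_a, eps, m, V) / hbar) / normalizing_constant(eps, m, hbar)**N
print(amplitude, expected)
```

Every path of N segments has the same magnitude $|A|^{-N}$; only the phase distinguishes one path from another.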


SOME REMARKS


In the relativistic theory of the electron we shall not find it possible
to express the amplitude for a path as $e^{iS/\hbar}$, or in any other simple
way. However, the laws for combining amplitudes still work (with some
small modifications). The amplitude for a trajectory still exists. As a
matter of fact, it is still given by Eq. (2.35). The only difference is that
K(i + 1, i) is not so easily expressed in a relativistic theory as it is in
Eq. (2.34). The complications arise from the necessity to consider spin
and the possibility of the production of pairs of electrons and positrons.

In nonrelativistic systems with a larger number of variables, and
even in the quantum theory of the electromagnetic field, not only do the
laws for combining amplitudes still hold but the amplitude itself follows
the rules set down in this chapter. That is, each motion of a variable
has an amplitude whose phase is $1/\hbar$ times the action associated with
the motion.

We shall take up these more complicated examples in later chapters. 


Developing the Concepts 
with Special Examples 


In this chapter we shall develop the kernels governing some special types
of motion. We shall explore the physical meaning of the mathematical 
results in order to develop some physical intuition about motion under 
quantum-mechanical laws. The wave function will be introduced and its 
relation to the kernel will be described. This represents the first step 
in connecting the present approach to quantum mechanics with more 
traditional approaches. 

We shall also introduce some special mathematical methods for com- 
puting the sum over all paths. The idea of a sum over all paths was de- 
scribed in Chap. 2 with the help of a particular computational method. 
Although that method may clarify the concept, it is an awkward tool 
with which to work. The simpler methods introduced in this chapter 
will be of great use in our future work. 

Thus the present chapter has three purposes: deepening our under- 
standing of quantum-mechanical principles, beginning the connection 
between our present approach and other approaches, and introducing 
some useful mathematical methods. 


THE FREE PARTICLE 


The Path Integral. The method used in Chap. 2 to describe a sum 
over all paths will be used here to compute the kernel for a free particle. 
The lagrangian for a free particle is 


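$$L = \frac{m}{2}\dot{x}^2 \tag{3.1}$$

Thus with the help of Eqs. (2.21) to (2.23) the kernel for a free particle
(distinguished by the subscript 0) is

$$K_0(b,a) = \lim_{\epsilon \to 0} \left( \frac{m}{2\pi i\hbar\epsilon} \right)^{N/2} \int\!\!\int\!\cdots\!\int \exp\left\{ \frac{im}{2\hbar\epsilon} \sum_{i=1}^{N} (x_i - x_{i-1})^2 \right\} dx_1 \cdots dx_{N-1} \tag{3.2}$$

This represents a set of gaussian integrals, i.e., integrals of the form
$\int e^{-ax^2}\,dx$ or $\int e^{-ax^2 + bx}\,dx$. Since the integral of a gaussian is again a
gaussian, we may carry out the integrations on one variable after the
other. After the integrations are completed, the limit may be taken.
The result is

$$K_0(b,a) = \left[ \frac{m}{2\pi i\hbar(t_b - t_a)} \right]^{1/2} \exp\left\{ \frac{im(x_b - x_a)^2}{2\hbar(t_b - t_a)} \right\} \tag{3.3}$$
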
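The calculation is carried out as follows. Notice first that

$$\int_{-\infty}^{\infty} \left( \frac{m}{2\pi i\hbar\epsilon} \right)^{2/2} \exp\left\{ \frac{im}{2\hbar\epsilon} \left[ (x_2 - x_1)^2 + (x_1 - x_0)^2 \right] \right\} dx_1 = \left( \frac{m}{2\pi i\hbar \cdot 2\epsilon} \right)^{1/2} \exp\left\{ \frac{im}{2\hbar \cdot 2\epsilon} (x_2 - x_0)^2 \right\} \tag{3.4}$$

Next we multiply this result by

$$\left( \frac{m}{2\pi i\hbar\epsilon} \right)^{1/2} \exp\left\{ \frac{im}{2\hbar\epsilon} (x_3 - x_2)^2 \right\} \tag{3.5}$$

and integrate again, this time over $x_2$. The result is similar to that of
Eq. (3.4), except that $(x_2 - x_0)^2$ becomes $(x_3 - x_0)^2$ and the expression
$2\epsilon$ is replaced by $3\epsilon$ in two places. Thus we get

$$\left( \frac{m}{2\pi i\hbar \cdot 3\epsilon} \right)^{1/2} \exp\left\{ \frac{im}{2\hbar \cdot 3\epsilon} (x_3 - x_0)^2 \right\}$$

In this way a recursion process is established which, after n − 1 steps,
gives

$$\left( \frac{m}{2\pi i\hbar \cdot n\epsilon} \right)^{1/2} \exp\left\{ \frac{im}{2\hbar \cdot n\epsilon} (x_n - x_0)^2 \right\}$$

Since $n\epsilon = t_n - t_0$, it is easy to see that the result after N − 1 steps is
identical with Eq. (3.3).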

There is an alternative procedure. Equation (3.4) can be used to
integrate over all variables $x_i$ for which i is odd (assuming N is even).
The result is an expression formally like Eq. (3.2), but with half as many
variables of integration. The remaining variables are defined at points
in time spaced a distance $2\epsilon$ apart. Hence, at least in the case that N
is of the form $2^k$, Eq. (3.3) results from k steps of this kind.
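
The single integration step of Eq. (3.4) can be verified numerically. For real $\epsilon$ the integrand oscillates without decaying, which makes direct quadrature awkward, so the sketch below gives $\epsilon$ a small negative imaginary part. That regularization is an assumption of the sketch and not a step taken in the text; it turns the integrand into a decaying gaussian while leaving both sides of Eq. (3.4) well defined. The values $m = \hbar = 1$, the end points, the grid, and the function name are likewise illustrative.

```python
import numpy as np

m, hbar = 1.0, 1.0
eps = 1.0 * (1.0 - 0.3j)     # assumed: time step with a small negative imaginary part
x0, x2 = 0.3, -0.7           # illustrative fixed end points

def free_step(x_to, x_from, dt):
    """Free-particle kernel of the form of Eq. (3.3) for a (possibly complex) interval dt."""
    prefactor = np.sqrt(m / (2j * np.pi * hbar * dt))
    return prefactor * np.exp(1j * m * (x_to - x_from)**2 / (2.0 * hbar * dt))

# Left side of Eq. (3.4): integrate the product of two one-step kernels over x1.
x1 = np.linspace(-20.0, 20.0, 40001)
integrand = free_step(x2, x1, eps) * free_step(x1, x0, eps)
dx = x1[1] - x1[0]
lhs = np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dx

# Right side of Eq. (3.4): a single kernel for the doubled interval.
rhs = free_step(x2, x0, 2.0 * eps)
print(lhs, rhs)
```

Repeating the step with `free_step(x3, x2, eps)` and the interval $2\epsilon$ in place of $\epsilon$ in the first factor checks the next stage of the recursion in the same way.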


Problem 3-1 The probability that a particle arrives at the point b 
is by definition proportional to the absolute square of the kernel K (b,a). 
For the free-particle kernel of Eq. (3.3) this is 


$$P(b)\,dx = \frac{m}{2\pi\hbar(t_b - t_a)}\,dx \tag{3.6}$$

Clearly this is a relative probability, since the integral over the complete
range of x diverges. What does the particular normalization mean?
Show that this corresponds to a classical picture in which a particle
starts from the point a with all momenta equally likely. Show that the
corresponding relative probability that the momentum of the particle
lies in the range dp is $dp/2\pi\hbar$.
Fig. 3-1 The real part of $\sqrt{i}$ times the amplitude to arrive at various distances x from the
origin after a given time t. The imaginary part (not shown) is an analogous wave 90°
out of phase, so that the absolute square of the amplitude is constant. The wavelength
is short at large x, namely, where a classical particle could arrive only if it moved with
high velocity. Generally, the wavelength and classical momentum are inversely related
(see Eq. 3.10).


Momentum and Energy. We now study some of the implications 
of the free-particle kernel. For convenience let the point a represent the 
origin in both time and space. The amplitude to go to some other point 
b = (x,t) is 


$$K_0(x, t; 0, 0) = \left( \frac{m}{2\pi i\hbar t} \right)^{1/2} \exp\left\{ \frac{imx^2}{2\hbar t} \right\} \tag{3.7}$$

If time is fixed, the amplitude varies with distance as shown in Fig. 3-1,
in which the real part of $\sqrt{i}\,K_0(x, t; 0, 0)$ is plotted.

We see that as we get farther from the origin the oscillations become 
more and more rapid. If x is so large that many oscillations have oc- 
curred, then the distance between successive nodes is nearly constant, 
at least for the next few oscillations. That is, the amplitude behaves 
much like a sine wave of slowly varying wavelength $\lambda$. Changing x by $\lambda$
must increase the phase of the amplitude by $2\pi$. That is,

$$2\pi = \frac{m(x + \lambda)^2}{2\hbar t} - \frac{mx^2}{2\hbar t} = \frac{mx\lambda}{\hbar t} + \frac{m\lambda^2}{2\hbar t}$$

Neglecting the quantity $\lambda^2$ relative to $x\lambda$ (that is, assuming $x \gg \lambda$), we
find

$$\lambda = \frac{2\pi\hbar}{m(x/t)} \tag{3.8}$$

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From a classical point of view a particle which moves from the origin 
to x in the time interval t has a velocity z/t and a momentum mg/t. 
From the quantum-mechanical point of view, when the motion can be 
adequately described by assigning a classical momentum to the particle 
of p= ma/t, then the amplitude varies in space with the wavelength 


_h 
p 
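As a numerical check of the step from Eq. (3.8) to Eq. (3.10), the sketch below solves the phase condition exactly for λ (keeping the λ² term) and compares the result with h/p for p = mx/t. The units ħ = m = 1 and the sample values of x and t are assumptions chosen only for illustration.

```python
import math

HBAR = 1.0  # assumed units: hbar = m = 1
M = 1.0


def phase(x, t):
    """Phase of the free-particle kernel, m x^2 / (2 hbar t)."""
    return M * x * x / (2.0 * HBAR * t)


def local_wavelength(x, t):
    """Exact lambda for which phase(x + lambda) - phase(x) = 2 pi.

    The condition is the quadratic lambda^2 + 2 x lambda - 4 pi hbar t / m = 0;
    its positive root is written in a form that avoids cancellation error.
    """
    c = 4.0 * math.pi * HBAR * t / M
    return c / (x + math.sqrt(x * x + c))


def de_broglie_wavelength(x, t):
    """h / p with p = m x / t and h = 2 pi hbar."""
    return 2.0 * math.pi * HBAR / (M * x / t)


if __name__ == "__main__":
    t = 1.0
    for x in (10.0, 100.0, 1000.0):
        ratio = local_wavelength(x, t) / de_broglie_wavelength(x, t)
        print(f"x = {x:7.1f}   lambda_exact / (h/p) = {ratio:.6f}")
```

The ratio approaches 1 as x grows, which is the regime x ≫ λ assumed in the text.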


We may show this relation still more generally. Suppose we have 
some large piece of apparatus, such as a magnetic analyzer, which is 
supposed to bring particles of a given momentum p to a given point. We 
shall show that, whenever the apparatus is large enough that classical 
physics offers a good approximation, then the amplitude for a particle to 
arrive at the prescribed point varies rapidly in space with a wavelength 
equal to h/p. For as we have seen, in such a situation, the kernel is 
approximated by 


$$K \sim \exp\left[\frac{i}{\hbar} S_{cl}(b,a)\right] \tag{3.11}$$

Changes in the position of the final point $x_b$ cause variations in the
classical action. If this action is large compared to ħ (the semiclassical
approximation), the kernel will oscillate very rapidly with changes in $x_b$.
The change in phase per unit displacement of the end point is

$$k = \frac{1}{\hbar}\frac{\partial S_{cl}}{\partial x_b} \tag{3.12}$$

but $\partial S_{cl}/\partial x_b$ is the classical momentum of the particle when it arrives
at the point $x_b$ (see Prob. 2-4). Thus p = ħk. This quantity k, the phase
change per unit distance of a wave, is called the wave number, and it is
very convenient to use. Since the wavelength is the distance over which
the phase changes by 2π, then k = 2π/λ. Equation (3.12) is de Broglie’s
formula relating the momentum to the wave number of a wave, p = ħk.

Next, let us study the time dependence of the free-particle kernel
given by Eq. (3.7). Suppose we hold the distance fixed and vary the
time. The variation of the real part of $\sqrt{i}$ times the kernel is shown in
Fig. 3-2. Both frequency and amplitude change with t.

Fig. 3-2 The amplitude to find the particle at a given point varies with
time. The real part of $\sqrt{i}\,K_0$ is plotted here. The frequency of the
oscillations is proportional to the energy that a classical particle would
have to have to arrive at the point in question within the time interval t.





Suppose t is very large and neglect the change in amplitude with
variations of t. The period of oscillation T is defined as the time required
to decrease the phase by 2π. Thus

$$2\pi = \frac{mx^2}{2\hbar t} - \frac{mx^2}{2\hbar(t+T)} = \frac{mx^2}{2\hbar t^2}\left(\frac{T}{1+T/t}\right) \tag{3.13}$$

By introducing the angular frequency ω = 2π/T, and assuming t ≫ T,
we can write this equation as

$$\omega \approx \frac{m}{2\hbar}\left(\frac{x}{t}\right)^2 \tag{3.14}$$

Since (m/2)(x/t)² is the classical energy of a free particle, this equation
says

$$\text{Energy} = \hbar\omega \tag{3.15}$$
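The same kind of check can be made in time. The sketch below solves the condition "the phase decreases by 2π" exactly for the period T and compares ħω = 2πħ/T with the classical energy (m/2)(x/t)². The units ħ = m = 1 and the sample values are assumptions for illustration only.

```python
import math

HBAR = 1.0  # assumed units: hbar = m = 1
M = 1.0


def phase(x, t):
    """Phase of the free-particle kernel, m x^2 / (2 hbar t)."""
    return M * x * x / (2.0 * HBAR * t)


def exact_period(x, t):
    """Exact T for which phase(x, t) - phase(x, t + T) = 2 pi."""
    c = M * x * x / (2.0 * HBAR)
    if c <= 2.0 * math.pi * t:
        raise ValueError("the phase never drops by a full 2 pi after time t")
    return 2.0 * math.pi * t * t / (c - 2.0 * math.pi * t)


def classical_energy(x, t):
    """Kinetic energy of a free particle that covers x in time t."""
    return 0.5 * M * (x / t) ** 2


if __name__ == "__main__":
    t = 1.0
    for x in (10.0, 100.0, 1000.0):
        omega = 2.0 * math.pi / exact_period(x, t)
        print(f"x = {x:7.1f}   hbar*omega / E = {HBAR * omega / classical_energy(x, t):.6f}")
```

The ratio tends to 1 when many oscillations have already occurred, that is, when t ≫ T.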


This relation, like the one relating momentum and wavelength, holds
for any apparatus which can be adequately described by classical physics;
and, like the previous relation, it can be obtained from a more general
argument.

Referring to Eq. (3.11), any variation of the time $t_b$ of an end point
will cause a rapid oscillation of the kernel. The resulting frequency is

$$\omega = -\frac{1}{\hbar}\frac{\partial S_{cl}}{\partial t_b} \tag{3.16}$$

The quantity $-\partial S_{cl}/\partial t_b$ is interpreted classically as the energy E (refer
to Prob. 2-5). Thus

$$\omega = \frac{E}{\hbar} \tag{3.17}$$




In this way the concepts of momentum and energy are extended to 
quantum mechanics by the following rules: 


1. If the amplitude varies in space as $e^{ikx}$, we say that the
particle has momentum ħk.

2. If the amplitude varies in time as $e^{-i\omega t}$, we say that the
particle has energy ħω.


We have just shown that these rules agree with the usual definitions of 
energy and momentum in the classical limit. 


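Problem 3-2 Show by substitution that the free-particle kernel
$K_0(b,a)$ satisfies the differential equation

$$\frac{\partial K_0(b,a)}{\partial t_b} = -\frac{i}{\hbar}\left[-\frac{\hbar^2}{2m}\frac{\partial^2 K_0(b,a)}{\partial x_b^2}\right] \tag{3.18}$$

whenever $t_b$ is greater than $t_a$.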
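A finite-difference spot check can catch sign or factor errors before the analytic substitution is attempted. The sketch below is only such a check, not the solution the problem asks for: it evaluates both sides of Eq. (3.18) for the kernel of Eq. (3.7) with central differences. The units ħ = m = 1, the step size h, and the sample points are assumptions.

```python
import cmath
import math

HBAR = 1.0  # assumed units: hbar = m = 1
M = 1.0


def free_kernel(x, t):
    """Free-particle kernel K0(x, t; 0, 0) of Eq. (3.7), valid for t > 0."""
    prefactor = cmath.sqrt(M / (2j * math.pi * HBAR * t))
    return prefactor * cmath.exp(1j * M * x * x / (2.0 * HBAR * t))


def residual(x, t, h=1e-4):
    """|dK/dt - (-(i/hbar)) * (-(hbar^2/2m)) d2K/dx2| from central differences."""
    dk_dt = (free_kernel(x, t + h) - free_kernel(x, t - h)) / (2.0 * h)
    d2k_dx2 = (free_kernel(x + h, t) - 2.0 * free_kernel(x, t)
               + free_kernel(x - h, t)) / (h * h)
    rhs = -(1j / HBAR) * (-(HBAR ** 2) / (2.0 * M)) * d2k_dx2
    return abs(dk_dt - rhs)


if __name__ == "__main__":
    for x, t in ((0.7, 1.3), (-2.0, 0.5)):
        print(f"x = {x:5.2f}, t = {t:4.2f}: residual = {residual(x, t):.2e}")
```

The residual is limited by the discretization and rounding error of the differences, so it is small but not zero.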


DIFFRACTION THROUGH A SLIT


The Conceptual Experiment. We can learn more about the phys- 
ical interpretation of quantum mechanics and its relation to classical me- 
chanics by considering another, somewhat more complicated, example. 
Suppose a free particle is liberated at t = 0 from the origin and then, at
an interval of time T later, we observe that it is at a certain point X.
Classically, we would say that the particle has had a velocity V = X/T.
The implication would be that if a particle were left alone to continue
for another interval of time t', it would move an additional distance
x' = Vt'. To analyze this quantum-mechanically, we shall attempt to
solve the following problem:

At t = 0 the particle starts from the origin x = 0. After an interval T
we shall suppose that it is known that the particle is within the distance
±b of X. We ask: After an additional interval t', what is the probability
of finding the particle at an additional displacement x' from the position
X? The net amplitude to arrive at this position x' at the time T + t'
can be considered as the sum of contributions from every trajectory that
goes from the origin to the final point, provided that the trajectory lies
in the interval ±b from X at the time T.

We shall calculate this in a moment, but first it is worth remarking
on what kind of an experiment we are contemplating here. How can
we know that the particle passes the point X within the interval ±b?
One way would be to make an observation of the particle at the time
T to see if it is within the interval ±b. This would be the most natural
way to proceed, but it is somewhat more difficult to analyze in detail
(because of the complicated interaction between the particle and the
observing mechanism) than another way of doing the experiment.

Fig. 3-3 A particle starting at x = 0 when t = 0 is determined to pass between X − b
and X + b at t = T. We wish to calculate the probability of finding the particle at
some point x at a time t' seconds later, i.e., when t = T + t'. According to classical
laws, the particle would have to be between (X − b)(1 + t'/T) and (X + b)(1 + t'/T),
that is, between the rectilinear extensions of the original slit. However, quantum-
mechanical laws show that such particles have nonzero probability of appearing outside
these classical limits.

We cannot approach this problem by a single application of the free-particle law of
motion, since the particle is actually constrained by the slit. So we break the problem
up into two successive free-particle motions. The first takes the particle from x = 0
at t = 0 to x = X + y at t = T, where |y| < b. The second takes the particle from
x = X + y at t = T to x at t = T + t'. The overall amplitude is an integral of the
product of these two free-particle kernels, as shown in Eq. (3.19).

Suppose we look, say, with very strong light, everywhere all along the
x axis except within ±b from the point X at the time T. If we find the
particle, we discontinue the experiment. We consider only those cases
in which a thorough investigation of the region, except for the region
±b, shows no particle is present. That is, all trajectories which pass
outside the limits ±b from X are rejected. The experimental situation
is diagramed in Fig. 3-3. The amplitude then can be written as

$$\psi(x') = \int_{-b}^{b} K(X+x', T+t'; X+y, T)\, K(X+y, T; 0, 0)\, dy \tag{3.19}$$


This expression is written down in accordance with the rule for
combining amplitudes of events occurring in succession in time (Sec. 2-5).
The first event is that the particle goes from the origin to the slit. The
second event is that the particle proceeds from the slit to the point x'
further on. The slit has a finite width, and passage through each
elemental interval of the slit represents an alternative way of proceeding
along the complete path. Thus we must integrate over the width of the
slit. All particles which miss the slit are captured and removed from
the experiment. Amplitudes for such particles are not included. All the
particles which get through the slit move as free particles with kernels
given by Eq. (3.3). Thus the amplitude is


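$$\psi(x') = \int_{-b}^{b} \left(\frac{2\pi i\hbar t'}{m}\right)^{-1/2} \exp\left[\frac{im(x'-y)^2}{2\hbar t'}\right] \left(\frac{2\pi i\hbar T}{m}\right)^{-1/2} \exp\left[\frac{im(X+y)^2}{2\hbar T}\right] dy \tag{3.20}$$

This integral can be expressed in terms of Fresnel integrals. Such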
a representation contains the physical results we are after, but in an 
obscure way owing to the mathematical complexity of the Fresnel in- 
tegral form. Rather than confuse the physical results by mathematical 
complexity, we shall set up a different, but analogous, expression which 
leads to a simpler mathematical form. 


The Gaussian Slit. Suppose we introduce a function G(y) as a 
factor in the integrand. If this function is defined as 


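$$G(y) = \begin{cases} 1 & \text{for } -b \le y \le b \\ 0 & \text{for } |y| > b \end{cases}$$

the limits of integration can be extended to infinity without any change
in the result. Then

$$\psi(x') = \int_{-\infty}^{\infty} \frac{m}{2\pi i\hbar\sqrt{t'T}}\, G(y) \exp\left\{\frac{im}{2\hbar}\left[\frac{(x'-y)^2}{t'} + \frac{(X+y)^2}{T}\right]\right\} dy \tag{3.21}$$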


Instead of this, suppose we define G(y) to be a gaussian function, 
thus: 


$$G(y) = e^{-y^2/2b^2} \tag{3.22}$$


This function has the shape shown in Fig. 3-4. The effective width of 
such a curve is related to the parameter b. For this particular function, 
approximately two-thirds of the area under the curve lies between −b
and +b.

Fig. 3-4 The form of the gaussian function $G(y) = e^{-y^2/2b^2}$. The curve has
the same shape as the normal distribution with a standard deviation b.
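The "two-thirds" figure can be made precise with the error function. The sketch below computes the fraction of the area under exp(−y²/2b²) lying within ±nb and, for comparison, the corresponding fraction for the relative probability [G(y)]² = exp(−y²/b²). The function names are illustrative.

```python
import math


def amplitude_area_fraction(n):
    """Fraction of the area under exp(-y^2 / 2 b^2) lying in |y| <= n * b."""
    return math.erf(n / math.sqrt(2.0))


def probability_area_fraction(n):
    """Same fraction for [G(y)]^2 = exp(-y^2 / b^2)."""
    return math.erf(n)


if __name__ == "__main__":
    print(f"amplitude G, |y| <= b:     {amplitude_area_fraction(1.0):.4f}")
    print(f"probability G^2, |y| <= b: {probability_area_fraction(1.0):.4f}")
```

For n = 1 the first fraction is about 0.683, the familiar one-standard-deviation value of the normal distribution.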





We do not know how to design metal parts for our imaginary ex- 
periment which will produce such a gaussian slit. However, there is no 
conceptual difficulty. We now have a situation in which the particles at 
time T are distributed along the x axis with a relative amplitude
proportional to the function G(y). (The relative probability is proportional
to [G(y)]².) If the particles move classically, we would expect, after a
succeeding interval of time t', to find them similarly distributed along
the x axis with a new center a distance $x'_1$ beyond X and an increased
width parameter $b_1$ given by

$$x'_1 = X\frac{t'}{T} \qquad b_1 = b\left(1 + \frac{t'}{T}\right) \tag{3.23}$$


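as shown in Fig. 3-5.
With such a gaussian slit the equation for the amplitude is

$$\psi(x') = \int_{-\infty}^{\infty} \frac{m}{2\pi i\hbar\sqrt{t'T}} \exp\left\{\frac{im}{2\hbar}\left[\frac{(x'-y)^2}{t'} + \frac{(X+y)^2}{T}\right] - \frac{y^2}{2b^2}\right\} dy \tag{3.24}$$

This integral is of the form

$$\int_{-\infty}^{\infty} \exp(\alpha y^2 + \beta y)\, dy = \sqrt{\frac{\pi}{-\alpha}} \exp\left(-\frac{\beta^2}{4\alpha}\right) \quad \text{for } \mathrm{Re}\{\alpha\} \le 0 \tag{3.25}$$

which is integrated by completing the square in the exponent. Thus the
amplitude becomes

$$\psi(x') = \sqrt{\frac{m}{2\pi i\hbar}} \left(T + t' + i\frac{\hbar t'T}{mb^2}\right)^{-1/2} \exp\left\{\frac{im}{2\hbar}\left(\frac{x'^2}{t'} + \frac{X^2}{T}\right) + \frac{(m^2/2\hbar^2)(x'/t' - X/T)^2}{(im/\hbar)(1/t' + 1/T) - 1/b^2}\right\} \tag{3.26}$$

The classical velocity to get from the origin to the center of the slit
is V = X/T. When we use this as a substitution and rearrange some of
the terms, the expression for the amplitude becomes

$$\psi(x') = \sqrt{\frac{m}{2\pi i\hbar}} \left(T + t' + i\frac{\hbar t'T}{mb^2}\right)^{-1/2} \exp\left\{\frac{im}{2\hbar}\left(\frac{x'^2}{t'} + V^2T\right) + \frac{(m^2/2\hbar^2t'^2)(x' - Vt')^2}{(im/\hbar)(1/t' + 1/T) - 1/b^2}\right\} \tag{3.27}$$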
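The gaussian integral formula used above, ∫ exp(αy² + βy) dy = √(π/(−α)) exp(−β²/4α), can be checked numerically for a complex α with negative real part. The sketch below compares a Simpson-rule estimate with the closed form; the particular α and β, the integration range, and the number of steps are assumptions chosen for illustration.

```python
import cmath
import math


def simpson(f, a, b, n):
    """Composite Simpson rule with n (even) subintervals; f may be complex."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def gaussian_integral_numeric(alpha, beta, half_width=12.0, n=6000):
    """Integral of exp(alpha y^2 + beta y) over [-half_width, half_width]."""
    return simpson(lambda y: cmath.exp(alpha * y * y + beta * y),
                   -half_width, half_width, n)


def gaussian_integral_closed(alpha, beta):
    """sqrt(pi / (-alpha)) * exp(-beta^2 / (4 alpha)), principal square root."""
    return cmath.sqrt(math.pi / (-alpha)) * cmath.exp(-beta * beta / (4.0 * alpha))


if __name__ == "__main__":
    alpha = complex(-0.5, 0.8)  # Re(alpha) < 0, so the integrand decays
    beta = complex(0.3, -0.4)
    print("numeric:", gaussian_integral_numeric(alpha, beta))
    print("closed :", gaussian_integral_closed(alpha, beta))
```

The principal branch of the square root is the right one here because −α has a positive real part.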

We shall consider first the relative probability for the particle to
arrive at various points along the x axis. This probability is proportional
to the absolute square of the amplitude. The absolute value of an
exponential with an imaginary argument is 1. So, by rationalizing the second
factor and the denominator in the last exponent of Eq. (3.27), we obtain

$$P(x') = \frac{m}{2\pi\hbar T}\,\frac{b}{\Delta x} \exp\left\{-\frac{(x' - Vt')^2}{(\Delta x)^2}\right\} \tag{3.28}$$

where we have used the substitution

$$(\Delta x)^2 = b^2\left(1 + \frac{t'}{T}\right)^2 + \frac{\hbar^2 t'^2}{m^2 b^2} \tag{3.29}$$

Fig. 3-5 The paths of particles moving through a gaussian slit. If the particles obeyed
classical laws of motion, as shown here, then the distribution at time T + t' would have
the same form as the distribution at time T. The difference would be only a spreading
out proportional to the time of flight. The characteristic width of the distribution would
increase from 2b to $2b_1$, where $b_1 = b(1 + t'/T)$. For quantum-mechanical motion, the
actual spreading is greater than this.

As we expected, the distribution is a gaussian centered about the
point $x'_1 = Vt'$ of Eq. (3.23). However, the spread of the distribution,
Δx, is larger than the classically expected value $b_1$ of Eq. (3.23). This
can be interpreted in the following manner. Suppose $x_1$ and $x_2$ are two
independent random variables whose root-mean-square deviations about
their average values are respectively $\Delta x_1$ and $\Delta x_2$. Then if $x_3 = x_1 + x_2$,
the rms deviation of $x_3$ about its average value is $\Delta x_3 = [(\Delta x_1)^2 + (\Delta x_2)^2]^{1/2}$.
Now, the rms deviation in a particular distribution is a measure of the
spread, or width, of the distribution. As a matter of fact, for the gaussian
distribution $e^{-x^2/2b^2}$ the rms value is b.
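The step from the complex amplitude to the gaussian probability can be checked numerically. The sketch below evaluates the amplitude in the form ψ = √(m/2πiħ) (T + t' + iħt'T/mb²)^(−1/2) exp{(im/2ħ)(x'²/t' + V²T) + (m²/2ħ²t'²)(x' − Vt')² / [(im/ħ)(1/t' + 1/T) − 1/b²]} and compares |ψ|² with (m/2πħT)(b/Δx) exp[−(x' − Vt')²/(Δx)²], where (Δx)² = b²(1 + t'/T)² + (ħt'/mb)². The units ħ = m = 1 and the sample parameters are assumptions.

```python
import cmath
import math

HBAR = 1.0  # assumed units: hbar = m = 1
M = 1.0


def amplitude(x, t, T, V, b):
    """Gaussian-slit amplitude; x and t are the primed displacement and time."""
    prefactor = (cmath.sqrt(M / (2j * math.pi * HBAR))
                 * (T + t + 1j * HBAR * t * T / (M * b * b)) ** -0.5)
    exponent = ((1j * M / (2.0 * HBAR)) * (x * x / t + V * V * T)
                + (M * M / (2.0 * HBAR ** 2 * t * t)) * (x - V * t) ** 2
                / ((1j * M / HBAR) * (1.0 / t + 1.0 / T) - 1.0 / (b * b)))
    return prefactor * cmath.exp(exponent)


def width_squared(t, T, b):
    """(Delta x)^2 = classical width squared plus quantum spreading squared."""
    return (b * (1.0 + t / T)) ** 2 + (HBAR * t / (M * b)) ** 2


def probability(x, t, T, V, b):
    """Gaussian probability distribution with width Delta x."""
    dx2 = width_squared(t, T, b)
    return (M / (2.0 * math.pi * HBAR * T)) * (b / math.sqrt(dx2)) \
        * math.exp(-(x - V * t) ** 2 / dx2)


if __name__ == "__main__":
    t, T, V, b = 1.5, 2.0, 1.2, 0.7
    for x in (0.0, 1.0, 2.3, 4.0):
        print(x, abs(amplitude(x, t, T, V, b)) ** 2, probability(x, t, T, V, b))
```

Only the modulus of the amplitude enters, so the choice of branch for the complex square roots does not affect the comparison.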

Thus in the present case we find that the quantum-mechanical system
acts as if it had an extra random variable $x_2$ whose rms deviation is

$$\Delta x_2 = \frac{\hbar t'}{mb} \tag{3.30}$$


It is this extra deviation $\Delta x_2$, or spreading, rather than the apparent
extra variable $x_2$, which has physical significance. That this term is
quantum-mechanical in nature is clear from the inclusion of the constant
ħ. Such a term is important for particles of small mass and for narrow
slits.

Thus quantum mechanics tells us that for small particles, passage
through a narrow slit makes the future position uncertain. This
uncertainty $\Delta x_2$ is proportional to the time interval t' between passage
through the slit and the next observation of position. If we introduce
the classical notion of velocity, we can say that passage through a slit
causes a velocity uncertainty whose size is

$$\delta v = \frac{\hbar}{mb} \tag{3.31}$$

We could take the width parameter 2b of the slit as a measure of the
uncertainty of the position of the particle at the time it passed through
the slit. If we call this uncertainty δx and write the product m δv as the
momentum uncertainty δp, then Eq. (3.31) becomes

$$\delta p\,\delta x = 2\hbar \tag{3.32}$$
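To get a feeling for the sizes involved, the sketch below evaluates the velocity uncertainty ħ/(mb) and the spreading ħt'/(mb) for an electron and for a much heavier grain passing the same slit. The slit half-width, flight time, and grain mass are hypothetical values chosen for illustration; the constants are standard SI values.

```python
HBAR = 1.054571817e-34            # J s
ELECTRON_MASS = 9.1093837015e-31  # kg


def velocity_uncertainty(mass, b):
    """delta v = hbar / (m b) for a slit of width parameter b (metres)."""
    return HBAR / (mass * b)


def quantum_spread(mass, b, t):
    """Extra rms spreading hbar t / (m b) after a flight time t (seconds)."""
    return HBAR * t / (mass * b)


if __name__ == "__main__":
    b = 1.0e-6            # hypothetical slit parameter: one micrometre
    t = 1.0e-3            # hypothetical flight time: one millisecond
    grain_mass = 1.0e-9   # hypothetical dust grain: one microgram
    for name, mass in (("electron", ELECTRON_MASS), ("grain", grain_mass)):
        dv = velocity_uncertainty(mass, b)
        print(f"{name:8s}  delta v = {dv:.3e} m/s   spread = {quantum_spread(mass, b, t):.3e} m")
        # delta p * delta x with delta x = 2 b reproduces 2 hbar
        print(f"{name:8s}  delta p * delta x / hbar = {mass * dv * 2.0 * b / HBAR:.3f}")
```

With these assumed numbers the electron acquires a velocity uncertainty of roughly 116 m/s, while the effect on the grain is far too small to observe, in line with the remark that the term matters for small masses and narrow slits.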


Once more we have arrived at a statement of the uncertainty
principle. It tells us that, although classically the velocity might be known,
the future position has an additional uncertainty as though a random
momentum had been generated by passing through a slit of width δx. If
classical concepts are used to describe the results of quantum mechanics
qualitatively, then we would say that knowledge of position creates
uncertainty in momentum.

What about the factors that appear in front of the exponent in
Eq. (3.28)? If we integrate this expression over the complete range of x'
from −∞ to +∞, the result is

$$P(\text{any } x') = \frac{m}{2\pi\hbar T}\sqrt{\pi}\, b \tag{3.33}$$

This must be the probability that the particle gets through the slit, since
the integration includes those particles and only those particles which
did get through. But we have another way of obtaining this result.
Suppose we take the absolute square of the kernel K(X + y, T; 0, 0),
which comprises the second half of the integrand in Eq. (3.20). This
is just the probability per unit distance that the particle arrives at the
point X + y in the slit. That is,

$$P(X + y)\, dy = \frac{m}{2\pi\hbar T}\, dy \tag{3.34}$$

This probability is independent of the position along the slit. Thus,
if we were to multiply it by the width of the slit, we would obtain the
total probability for the particle to arrive at the slit. This implies that
the effective width of our gaussian slit is √π b. Had we used the original
sharp-edged slit, we would have found the effective width to be 2b.
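The total probability of passage can be confirmed by integrating the gaussian distribution numerically. The sketch below integrates (m/2πħT)(b/Δx) exp[−(x' − Vt')²/(Δx)²] over x' with Simpson's rule and compares the result with (m/2πħT)√π b. The units ħ = m = 1 and the sample parameters are assumptions.

```python
import math

HBAR = 1.0  # assumed units: hbar = m = 1
M = 1.0


def simpson(f, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def width(t, T, b):
    """Delta x: classical width and quantum spreading added in quadrature."""
    return math.sqrt((b * (1.0 + t / T)) ** 2 + (HBAR * t / (M * b)) ** 2)


def probability(x, t, T, V, b):
    dx = width(t, T, b)
    return (M / (2.0 * math.pi * HBAR * T)) * (b / dx) \
        * math.exp(-((x - V * t) / dx) ** 2)


def total_probability(t, T, V, b, n=2000):
    """Integrate over +/- 10 widths about the classical center V t."""
    dx = width(t, T, b)
    return simpson(lambda x: probability(x, t, T, V, b),
                   V * t - 10.0 * dx, V * t + 10.0 * dx, n)


def expected_total(T, b):
    """Probability per unit length at the slit times the effective width sqrt(pi) b."""
    return (M / (2.0 * math.pi * HBAR * T)) * math.sqrt(math.pi) * b


if __name__ == "__main__":
    print(total_probability(1.5, 2.0, 1.2, 0.7), expected_total(2.0, 0.7))
```

The result does not depend on t', as it should: whatever gets through the slit is found somewhere later.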


Problem 3-3 By squaring the amplitude given in Eq. (3.20) and
then integrating over x', show that the probability of passage through
the original sharp-edged slit is

$$P(\text{going through}) = \frac{m}{2\pi\hbar T}\, 2b \tag{3.35}$$

In the course of this problem the integral

$$\int_{-\infty}^{\infty} e^{iax}\, dx \tag{3.36}$$

will appear. Apart from a factor of 2π, this is the integral representation
of the Dirac delta function of a.¹

Thus the quantum-mechanical results agree with the idea that the
probability that a particle goes through a slit is equal to the probability
that the particle arrives at the slit.


Momentum and Energy. Next we shall verify again that, when
the momentum is definite, the amplitude varies as $e^{ikx}$. We return to a
detailed study of the amplitude given in Eq. (3.27). This time we shall
try to arrange conditions in our experiment so that the particle velocity
after passing through the slit is known as accurately as possible.

¹See Eq. (A.9) in the table of integrals in the Appendix and L.I. Schiff, “Quantum
Mechanics,” 2nd ed., pp. 50-52, McGraw-Hill Book Company, New York, 1955.

Quite apart from any quantum-mechanical considerations, there is a
classical uncertainty of b/T in the velocity. For any given slit width we
can make this uncertainty negligible by choosing T very large. We can
also make X extremely large so that the average velocity X/T = V does
not go to 0. Considering V and t' as constants, the expression for the
amplitude in the limit of large T is

$$\psi(x') \approx \sqrt{\frac{m}{2\pi i\hbar T}} \left(1 + \frac{i\hbar t'}{mb^2}\right)^{-1/2} \exp\left\{\frac{im}{2\hbar}\left(\frac{x'^2}{t'} + V^2T\right) + \frac{(m^2/2\hbar^2 t'^2)(x' - Vt')^2}{im/\hbar t' - 1/b^2}\right\} \tag{3.37}$$

Next we arrange that the quantum-mechanical uncertainty in
momentum ħ/b is very small. That is, we take b very large, so we can
neglect 1/b². Then the amplitude can be written

$$\psi(x') \approx \text{const}\, \exp\left[\frac{imV}{\hbar}x' - \frac{imV^2}{2\hbar}t'\right] \tag{3.38}$$
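The algebra behind this last step is short and can be sketched as follows, starting from the exponent of Eq. (3.27) with 1/T and 1/b² neglected in the denominator of the second term. The term containing V²T does not depend on x' or t' and is absorbed into the constant.

```latex
\begin{align*}
\frac{im}{2\hbar}\left(\frac{x'^2}{t'} + V^2 T\right)
  + \frac{m^2 (x' - Vt')^2}{2\hbar^2 t'^2}\cdot\frac{\hbar t'}{im}
&= \frac{im}{2\hbar}\left(\frac{x'^2}{t'} + V^2 T\right)
  - \frac{im}{2\hbar t'}\left(x'^2 - 2x'Vt' + V^2 t'^2\right) \\
&= \frac{imV}{\hbar}\,x' - \frac{imV^2}{2\hbar}\,t' + \frac{imV^2 T}{2\hbar}
\end{align*}
```

The terms quadratic in x' cancel, which is why the resulting amplitude is a pure plane wave in x'.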

This is an important result. It says that, if we have arranged things
so that the momentum of a particle is known to be p, then the amplitude
for the particle to arrive at the point x at the time t is

$$\psi(x) \approx \text{const}\, \exp\left[\frac{i}{\hbar}px - \frac{i}{\hbar}\frac{p^2}{2m}t\right] \tag{3.39}$$

We notice that this is a wave of definite wave number k = p/ħ.
Furthermore, it has a definite frequency ω = p²/2mħ. This means we can say
that a free particle of momentum p has a definite quantum-mechanical
energy (defined as ħ times frequency) which is p²/2m just as in classical
mechanics.

The probability of arriving at any particular x, which is proportional
to the square of the amplitude, is in this case independent of x. Thus 
exact knowledge of velocity means no knowledge of position. In arrang- 
ing the experiment to give an accurately known velocity we have lost 
our chances for an accurate prediction of position. We have already 
seen that the reverse is true. The existence of the quantum-mechanical 
spread, inversely proportional to the slit width 2b, implies that an exact 
knowledge of position precludes any knowledge of velocity. So, if you 
know where it is, you cannot say how fast it is going; and, if you know 
how fast it is going, you cannot say where it is. This again illustrates 
the uncertainty principle. 


RESULTS FOR A SHARP-EDGED SLIT


Leaving the limiting case, suppose we return to a situation in which the
slit width and quantum-mechanical spread are comparable in size and
the times and distances of travel are not extremely large. We have seen
that a gaussian slit leads to a gaussian distribution. If we use the more
realistic sharp-edged version and work out the resulting Fresnel integrals,
the probability distribution at the time t' after passing through the slit
is that shown in Fig. 3-6.

Fig. 3-6 The distribution of particles that have passed through sharp-edged
slits of various widths. These distributions are symmetric about the mean
position V(T + t') so only the right-hand halves are shown. The classically
predicted width $b_1 = b(1 + t'/T)$ is indicated by a dashed vertical line. The
three distributions differ in the ratio of the classical width $b_1$ to the
quantum-mechanical spreading $\Delta x_2$: For curve (a), $b_1/\Delta x_2 = 15$; for curve (b),
$b_1/\Delta x_2 = 1$; and for curve (c), $b_1/\Delta x_2 = 1/15$. In each case, the
distribution spreads beyond the classical width. The rms width of the distribution
is approximately equal to $[(b_1)^2 + (\Delta x_2)^2]^{1/2}$.

This distribution is expressed by


$$P(x') = \frac{m}{2\pi\hbar(T+t')}\left(\frac{1}{2}[C(u_1) - C(u_2)]^2 + \frac{1}{2}[S(u_1) - S(u_2)]^2\right) \tag{3.40}$$

where

$$u_1 = \frac{(x' - Vt') - b(1 + t'/T)}{\sqrt{(\pi\hbar t'/m)(1 + t'/T)}} \qquad u_2 = \frac{(x' - Vt') + b(1 + t'/T)}{\sqrt{(\pi\hbar t'/m)(1 + t'/T)}} \tag{3.41}$$

and C(u) and S(u) are the real and imaginary parts of the Fresnel
integral. The first factor in this probability distribution is identical to
the probability distribution of a free particle given in Eq. (3.6). The
remaining factor contains a combination of real and imaginary Fresnel
integrals.¹ It is this factor which is responsible for the variations shown
in the curves of Fig. 3-6.
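As a consistency check, the Fresnel-integral form can be compared with a direct numerical integration of the sharp-slit amplitude, that is, the product of the two free-particle kernels integrated over −b ≤ y ≤ b. The sketch below assumes the Fresnel arguments u = [(x' − Vt') ∓ b(1 + t'/T)] / √[(πħt'/m)(1 + t'/T)], which follow from completing the square in the exponent, and the convention C(u) + iS(u) = ∫₀ᵘ exp(iπs²/2) ds. The units ħ = m = 1 and the sample parameters are assumptions.

```python
import cmath
import math

HBAR = 1.0  # assumed units: hbar = m = 1
M = 1.0


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals; f may be complex."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def fresnel_c(u):
    return simpson(lambda s: math.cos(0.5 * math.pi * s * s), 0.0, u)


def fresnel_s(u):
    return simpson(lambda s: math.sin(0.5 * math.pi * s * s), 0.0, u)


def probability_fresnel(x, t, T, V, b):
    """Sharp-slit distribution written with Fresnel integrals."""
    scale = math.sqrt((math.pi * HBAR * t / M) * (1.0 + t / T))
    u1 = ((x - V * t) - b * (1.0 + t / T)) / scale
    u2 = ((x - V * t) + b * (1.0 + t / T)) / scale
    dc = fresnel_c(u1) - fresnel_c(u2)
    ds = fresnel_s(u1) - fresnel_s(u2)
    return M / (2.0 * math.pi * HBAR * (T + t)) * 0.5 * (dc * dc + ds * ds)


def probability_direct(x, t, T, V, b):
    """|amplitude|^2 from integrating the product of kernels over the slit."""
    X = V * T

    def integrand(y):
        return cmath.exp(1j * M / (2.0 * HBAR)
                         * ((x - y) ** 2 / t + (X + y) ** 2 / T))

    amp = simpson(integrand, -b, b) * M / (2j * math.pi * HBAR * math.sqrt(t * T))
    return abs(amp) ** 2


if __name__ == "__main__":
    t, T, V, b = 1.0, 1.0, 2.0, 0.5
    for x in (1.0, 2.0, 3.5):
        print(x, probability_fresnel(x, t, T, V, b), probability_direct(x, t, T, V, b))
```

Agreement of the two columns supports the reconstruction of the prefactor and of the Fresnel arguments, but it is a numerical check under the stated assumptions rather than a derivation.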

Thus for both slits the general result is the same. The most probable 
place to find the particle is within the classical projection of the slit. 
Beyond this there is the quantum-mechanical spreading. 

We have treated this problem as if it were a combination of two 
separate motions. First the particle goes to the slit, and then it goes 
from the slit to the point of observation. The motion seems almost 
disjointed at the slit. It might be asked then, how does a particle with 
such a disjointed motion “remember” its velocity and head in the general 
direction predicted by classical physics? Or, to put it another way, how
does making the slit narrower cause a “loss of memory” until, in the
limit, all velocities are equally likely for the particle?

To understand this, let us investigate the amplitude to arrive at the
slit. This is just the free-particle amplitude given by Eq. (3.3), with
$x_a = 0$, $t_a = 0$, $x_b = X + y$, and $t_b = T$. As we move across the
slit (vary y), both real and imaginary parts of the amplitude vary
sinusoidally. As we have seen, the wavelength of this variation is connected
to the momentum (refer to Eq. 3.10). The subsequent motion is a
result of optical-like interference among these waves. The interference is
constructive in the general direction predicted by classical physics and,
in general, destructive in other directions.

If there are many wavelengths across the slit (i.e., the slit is very
wide) the resulting interference pattern is quite sharp and the motion
is approximately classical. But suppose the slit is made so narrow that
not even one whole wavelength is included. There are no longer any
oscillations to give an interference, and velocity information is lost. Thus
in the limit as the slit width goes to zero all velocities are equally likely
for the particle.

¹Refer to p. 34 of E. Jahnke and F. Emde, “Tables of Functions,” Dover
Publications, Inc., New York, 1943.


THE WAVE FUNCTION 


We have developed the amplitude for a particle to reach a particular
point in space and time by closely following its motion in getting there.
However, it is often useful to consider the amplitude to arrive at a
particular place without any special discussion of previous motion. Then
we would say ψ(x,t) is the total amplitude to arrive at (x,t) from the
past in some (perhaps unspecified) situation. Such an amplitude has the
same probability characteristics as those we have already studied; i.e.,
the probability of finding the particle at the point x and at the time t
is |ψ(x,t)|². We shall call this kind of amplitude a wave function. The
difference between this and the amplitudes we have studied before is
just a matter of notation. One often hears the statement: The system
is in the “state” ψ. This is just another way of saying: The system is
described by the wave function ψ(x,t).

Thus the kernel $K(x_b, t_b; x_a, t_a) = \psi(x_b, t_b)$ is actually a wave
function. It is the amplitude to get to $(x_b, t_b)$. The notation $K(x_b, t_b; x_a, t_a)$
gives us more information, in particular, that this is the amplitude for
a special case in which the particle came from $(x_a, t_a)$. Perhaps this
information is of no interest to the problem, so that there is no point
in keeping track of it. Then we just use the wave function notation
$\psi(x_b, t_b)$.

Since the wave function is an amplitude, it satisfies the rules for
combination of amplitudes for events occurring in succession in time.
Thus since Eq. (2.31) is true for all points $(x_a, t_a)$, we see that the wave
function satisfies the integral equation

$$\psi(x_b, t_b) = \int_{-\infty}^{\infty} K(x_b, t_b; x_c, t_c)\, \psi(x_c, t_c)\, dx_c \tag{3.42}$$


This result can be stated in physical terms. The total amplitude
to arrive at $(x_b, t_b)$ [that is, $\psi(x_b, t_b)$] is the sum, or integral, over all
possible values of $x_c$ of the total amplitude to arrive at the point $(x_c, t_c)$
[that is, $\psi(x_c, t_c)$] multiplied by the amplitude to go from c to b [that
is, $K(x_b, t_b; x_c, t_c)$]. This means that the effects of all the past history
of a particle can be expressed in terms of a single function. If we forget
everything we knew about a particle except its wave function at a
particular time, then we can calculate everything that can happen to that
particle after that time. All of history’s effect upon the future of the
universe could be obtained from a single gigantic wave function.
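The integral equation for the wave function can be tried out numerically. The sketch below propagates an assumed initial wave function, a gaussian exp(−x²/2σ²) at t = 0 (this choice is not from the text), with the free-particle kernel, and compares the result with the closed form (1 + iħt/mσ²)^(−1/2) exp[−x²/(2σ²(1 + iħt/mσ²))] obtained from the gaussian integral formula (3.25). Units ħ = m = 1 are assumed.

```python
import cmath
import math

HBAR = 1.0  # assumed units: hbar = m = 1
M = 1.0


def simpson(f, a, b, n):
    """Composite Simpson rule with n (even) subintervals; f may be complex."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def free_kernel(xb, tb, xa, ta):
    """Free-particle kernel for tb > ta."""
    dt = tb - ta
    return (cmath.sqrt(M / (2j * math.pi * HBAR * dt))
            * cmath.exp(1j * M * (xb - xa) ** 2 / (2.0 * HBAR * dt)))


def initial_wave(x, sigma):
    """Assumed (unnormalized) gaussian wave function at t = 0."""
    return math.exp(-x * x / (2.0 * sigma * sigma))


def propagate(xb, t, sigma, n=4000):
    """psi(xb, t) = integral of K(xb, t; xc, 0) psi(xc, 0) over xc."""
    limit = 10.0 * sigma
    return simpson(lambda xc: free_kernel(xb, t, xc, 0.0) * initial_wave(xc, sigma),
                   -limit, limit, n)


def closed_form(xb, t, sigma):
    z = 1.0 + 1j * HBAR * t / (M * sigma * sigma)
    return z ** -0.5 * cmath.exp(-xb * xb / (2.0 * sigma * sigma * z))


if __name__ == "__main__":
    for xb in (0.0, 0.5, 2.0):
        print(xb, propagate(xb, 1.0, 1.0), closed_form(xb, 1.0, 1.0))
```

A gaussian is used because it decays fast enough for a finite integration range; a pure plane wave, as in Problem 3-4, would need a convergence factor.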


Problem 3-4 Suppose a free particle has a definite momentum at
the time t = 0 (that is, the wave function is $Ce^{(i/\hbar)px}$). With the help of
Eqs. (3.3) and (3.42), show that at some later time the particle has the
same definite momentum (i.e., the wave function depends on x through
the function $e^{(i/\hbar)px}$) and varies in time as $e^{-(i/\hbar)(p^2/2m)t}$. This means
that the particle has the definite energy p²/2m.


Problem 3-5 Use the results of Prob. 3-2 and Eq. (3.42) to show
that the wave function of a free particle satisfies the equation

$$\frac{\partial\psi}{\partial t} = -\frac{i}{\hbar}\left[-\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2}\right] \tag{3.43}$$

which is the Schrödinger equation for a free particle.


GAUSSIAN INTEGRALS 


We are finished with the physical portion of this chapter, and we now 
proceed to mathematical considerations. We shall introduce some addi- 
tional mathematical techniques which will help us to compute the sum 
over paths in certain situations. 

The simplest path integrals are those in which all of the variables
appear up to the second degree in an exponent. We shall call them
gaussian integrals. In quantum mechanics this corresponds to a case in
which the action S involves the path x(t) up to and including the second
power.

To illustrate how the method works in such a case, consider a particle 
whose lagrangian has the form 


$$L = a(t)\dot{x}^2 + b(t)\dot{x}x + c(t)x^2 + d(t)\dot{x} + e(t)x + f(t) \tag{3.44}$$


The action is the integral of this function with respect to time between
two fixed end points. (Actually, in this form the lagrangian is a little
more general than necessary. The factor $\dot{x}$ could be removed from those
terms in which it is linear through an integration by parts, but this fact
is unimportant for our present purpose.) We wish to determine

$$K(b,a) = \int_a^b \exp\left\{\frac{i}{\hbar}\int_{t_a}^{t_b} L(\dot{x}, x, t)\, dt\right\} \mathcal{D}x(t) \tag{3.45}$$

the integral over all paths which go from $(x_a, t_a)$ to $(x_b, t_b)$.

Of course, it is possible to carry out the integral over all paths in the
way which was first described (in Sec. 2-4) by dividing the region into
short time elements, and so on. That this will work follows from the fact
that the integrand is an exponential of a quadratic form in the variables
$\dot{x}$ and x. Such integrals can always be carried out. But we shall not
go through this tedious calculation, since we can determine the most
important characteristics of the kernel in the following manner.

Let $\bar{x}(t)$ be the classical path between the specified end points. This
is the path which is an extremum for the action S. In the notation we
have been using

$$S_{cl}[b,a] = S[\bar{x}(t)] \tag{3.46}$$

We can represent x(t) in terms of $\bar{x}(t)$ and a new function y(t):

$$x(t) = \bar{x}(t) + y(t) \tag{3.47}$$

That is to say, instead of defining a point on the path by its distance
x(t) from an arbitrary coordinate axis, we measure instead the deviation
y(t) from the classical path, as shown in Fig. 3-7.





Fig. 3-7 The difference between the classical path $\bar{x}(t)$ and some possible
alternative path x(t) is the function y(t). Since the paths must both reach the
same end points, $y(t_a) = y(t_b) = 0$. In between these end points y(t) can take
any form. Since the classical path is completely fixed, any variation in the
alternative path x(t) is equivalent to the associated variation in the difference
y(t). Thus, in a path integral, the path differential $\mathcal{D}x(t)$ can be replaced by
$\mathcal{D}y(t)$, and the path x(t) by $\bar{x}(t) + y(t)$. In this form $\bar{x}(t)$ is a constant for the
integration over paths. Furthermore, the new path variable y(t) is restricted
to take the value 0 at both end points. This substitution leads to a path
integral independent of end-point positions.

At each t the variables x and y differ by the constant $\bar{x}$. (Of course,
this is a different constant for each value of t.) Therefore, clearly, $dx_i = dy_i$
for each specific point $t_i$ in the subdivision of time. In general, we
may say $\mathcal{D}x(t) = \mathcal{D}y(t)$.

The integral for the action can be written 


$$S[x(t)] = S[\bar{x}(t) + y(t)] = \int_{t_a}^{t_b} \left[a(t)\left(\dot{\bar{x}}^2 + 2\dot{\bar{x}}\dot{y} + \dot{y}^2\right) + \cdots\right] dt \tag{3.48}$$


If all the terms which do not involve y are collected, the resulting integral
is just $S[\bar{x}(t)] = S_{cl}$. If all the terms which contain y as a linear factor
are collected, the resulting integral vanishes. This could be proved by
actually carrying out the integration (some integration by parts would
be involved); however, such a calculation is unnecessary, since we already
know the result is true. The function $\bar{x}(t)$ is determined by this very
requirement. That is, $\bar{x}(t)$ is so chosen that there is no change in S, to
first order, for variations of the path around $\bar{x}(t)$. All that remains are
the second-order terms in y. These can be easily picked out, so that we
can write

$$S[x(t)] = S_{cl}[b,a] + \int_{t_a}^{t_b} \left[a(t)\dot{y}^2 + b(t)\dot{y}y + c(t)y^2\right] dt \tag{3.49}$$
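The vanishing of the terms linear in y can be seen in a time-sliced example. The sketch below takes the simplest case, a free particle with a(t) = m/2 and all other coefficients zero (an assumption made to keep the classical path a straight line), adds a deviation y(t) that vanishes at both end points, and checks that the discretized action splits into the classical action plus the action of the deviation alone. The sine-shaped deviation and the numerical values are illustrative.

```python
import math

M = 1.0  # assumed unit mass


def discrete_action(path, dt):
    """Time-sliced free-particle action: sum of (m/2) (dx/dt)^2 dt."""
    return sum(0.5 * M * ((path[i + 1] - path[i]) / dt) ** 2 * dt
               for i in range(len(path) - 1))


def classical_path(xa, xb, n):
    """Straight line between the end points, sampled at n + 1 times."""
    return [xa + (xb - xa) * i / n for i in range(n + 1)]


def deviation(n, amplitude):
    """A sample y(t) with y = 0 at both end points."""
    y = [amplitude * math.sin(math.pi * i / n) for i in range(n + 1)]
    y[0] = y[-1] = 0.0
    return y


def action_split(xa, xb, duration, n, amplitude):
    """Return (S[xbar + y], S[xbar], S[y]) for the discretized paths."""
    dt = duration / n
    xbar = classical_path(xa, xb, n)
    y = deviation(n, amplitude)
    x = [p + q for p, q in zip(xbar, y)]
    return discrete_action(x, dt), discrete_action(xbar, dt), discrete_action(y, dt)


if __name__ == "__main__":
    total, s_cl, s_y = action_split(0.0, 2.0, 1.0, 200, 0.3)
    print("S[x]        =", total)
    print("S_cl + S[y] =", s_cl + s_y)
```

For the straight-line path the cross term is a constant velocity times the sum of the increments of y, which telescopes to y at the end minus y at the start, that is, to zero.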


The integral over paths does not depend upon the classical path, so 
the kernel can be written 


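$$K(b,a) = e^{(i/\hbar)S_{cl}[b,a]} \int_0^0 \exp\left\{\frac{i}{\hbar}\int_{t_a}^{t_b} \left[a(t)\dot{y}^2 + b(t)\dot{y}y + c(t)y^2\right] dt\right\} \mathcal{D}y(t) \tag{3.50}$$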


Since all paths y(t) start from and return to the point y = 0, the
integral over paths can be a function only of times at the end points.
This means that the kernel can be written as

$$K(b,a) = e^{(i/\hbar)S_{cl}[b,a]}\, F(t_b, t_a) \tag{3.51}$$

so K is determined except for a function of $t_b$ and $t_a$. In particular, the
kernel’s dependence upon the spatial variables $x_b$ and $x_a$ is completely
worked out. It should be noted that the dependence of the kernel upon
the coefficients of the linear terms d(t) and e(t) and the remaining
coefficient f(t) is also completely worked out.

This seems to be characteristic of various methods of doing path
integrals; a great deal can be worked out by some general methods, but
often a multiplying factor is not fully determined. It must be
determined by some other known property of the solution, as, for example,
by Eq. (2.31).

It is interesting to note that the approximate expression $K \sim e^{iS_{cl}/\hbar}$
is exact for the case that S is a quadratic form.


Problem 3-6 Since the free-particle lagrangian is quadratic, show
that (Prob. 2-1)

$$K(b,a) = F(t_b, t_a) \exp\left[\frac{im(x_b - x_a)^2}{2\hbar(t_b - t_a)}\right] \tag{3.52}$$

and give an argument to show that F can depend only on the difference,
$F(t_b - t_a)$.


Problem 3-7 Further information about this function F can be
obtained from the property expressed by Eq. (2.31). First notice that
the results of Prob. 3-6 imply that $F(t_b - t_a)$ can be written as F(t),
where t is the time interval $t_b - t_a$. By using this form for F in Eq. (3.52)
and substituting into Eq. (2.31), express F(t + s) in terms of F(t) and
F(s), where $t = t_b - t_c$ and $s = t_c - t_a$. Show that if F is written as

$$F(t) = \left(\frac{m}{2\pi i\hbar t}\right)^{1/2} f(t) \tag{3.53}$$

the new function f(t) must satisfy

$$f(t + s) = f(t)f(s) \tag{3.54}$$

This means that f(t) must be of the form

$$f(t) = e^{at} \tag{3.55}$$

where a may be complex, that is, a = α + iβ. It is difficult to obtain
more information about the function f(t) from the principles we have
so far laid down. However, the special choice of the normalizing factor
A defined in Eq. (2.21) implies that f(ε) = 1 to first order in ε. This
corresponds to setting a in Eq. (3.55) equal to 0. The resulting value of
F(t) is in agreement with Eq. (3.3).


It is clear from this example how the important properties of path
integrals may be easily obtained even though the integrand may be a
complicated function. So long as the integrand is an exponential function
which contains the path variables only up to the second order, a solution
that will be complete except possibly for some simple multiplying factors
can be obtained. This is true regardless of the number of variables.
Thus, for example, a path integral of the form

$$\int_a^b \int_c^d \cdots \int_e^f \exp\{E[x(t), y(t), \ldots, z(t)]\}\, \mathcal{D}x(t)\, \mathcal{D}y(t) \cdots \mathcal{D}z(t)$$    (3.56)

contains as its important factor $e^{E_{cl}}$, where $E_{cl}$ is the
extremum of E subject to the boundary conditions. The only restriction
is that in terms of the variables x, y, and so on, E is a function of
the second degree. The remaining factor is a function of the times at
the end points of the paths. For most of the path integrals which we
shall study, the important information is contained in the exponential
term rather than in the latter factor. In fact in most cases we shall
not even find it necessary to evaluate this latter factor. This method
of solving path integrals will be used frequently in the succeeding
chapters.


MOTION IN A POTENTIAL FIELD 


One simple application comes in the classical limiting case in which the
action S is very large compared to Planck's constant $\hbar$. As we have
already pointed out for this situation, the kernel K is approximately
proportional to $e^{iS_{cl}/\hbar}$. We can now see more mathematically
the basis of this approximation. Only those paths quite near to the
classical path $\bar{x}(t)$ are important, so suppose we make the
substitution $x(t) = \bar{x}(t) + y(t)$. Now if the particle is moving
through the potential V(x), we can write

$$V(x) = V(\bar{x} + y) = V(\bar{x}) + y V'(\bar{x}) + \frac{y^2}{2} V''(\bar{x}) + \frac{y^3}{6} V'''(\bar{x}) + \cdots$$    (3.57)


where the prime indicates differentiation with respect to x and all
derivatives are evaluated along the classical path $\bar{x}$. Only small
values of y are important, so suppose V is a sufficiently smooth
function that we can neglect terms of order $y^3$ and higher. This means
that we assume that $y^3 V'''$ and all higher-order terms are negligible
compared to the terms kept.

Under this assumption the integrand can be expressed as a quadratic
form in y. In fact, since $\bar{x}$ makes S extreme,

$$S = S_{cl} + \text{terms second order in } y.$$

The important term in the result is $e^{iS_{cl}/\hbar}$, where now, of
course, $S_{cl}$ contains the potential $V(\bar{x})$ along the classical
path. The remaining integral over y goes from 0 to 0 and is of the form
of the last factor in Eq. (3.50). It provides a smooth function as a
factor to $e^{iS_{cl}/\hbar}$.

The result is true in situations other than the classical limiting case.
For example, suppose V(x) is a quadratic function of x. Then the
solution is exact, since the expansion of V(x) as in Eq. (3.57) contains
no powers higher than the second. Some examples of this type are given
in the problems. As another example, suppose V(x) is a slowly varying
function. In particular, if the third and higher derivatives are extremely
small, the result given above is a very accurate approximation. This
particular case is called the WKB approximation in quantum mechanics.
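
The statement that the expansion of Eq. (3.57) terminates for a
quadratic potential can be illustrated with a small numerical check. The
sketch below is an illustration only; the two sample potentials
(a quadratic with an arbitrary constant k, and a quartic) and all
function names are assumptions chosen for the example.

```python
def second_order_expansion(V, dV, d2V, xbar, y):
    """V(xbar) + y V'(xbar) + (y^2 / 2) V''(xbar): Eq. (3.57) cut after y^2."""
    return V(xbar) + y * dV(xbar) + 0.5 * y * y * d2V(xbar)


def remainder(potential, xbar, y):
    """Exact V(xbar + y) minus its second-order expansion about xbar."""
    V, dV, d2V = potential
    return V(xbar + y) - second_order_expansion(V, dV, d2V, xbar, y)


k = 3.0  # arbitrary force constant for the quadratic example
quadratic = (lambda x: 0.5 * k * x * x, lambda x: k * x, lambda x: k)
quartic = (lambda x: x ** 4, lambda x: 4 * x ** 3, lambda x: 12 * x * x)

print(remainder(quadratic, 0.4, 1.7))  # zero up to rounding, even for large y
print(remainder(quartic, 0.4, 0.1))    # equals 4*xbar*y^3 + y^4, small for small y
```

For the quartic potential the neglected terms are of order y^3, so the
approximation is good only when the important paths stay close to the
classical one.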

There are other situations in which the approximation is good. Sup- 
pose the total time interval for the motion is very short. If a particle 
moves along a path differing greatly from the classical path, it must have 
a very large extra velocity (to go out from the initial point and then re- 
turn to the final point in the allotted time interval). The extra kinetic 
energy is proportional to the square of this large velocity, and the action 
contains a term roughly proportional to the kinetic energy multiplied 
by the time interval (thus, the square of the velocity multiplied by the 
time interval). The action for such paths will be very large, and the 
phase of the amplitude will vary greatly for closely neighboring paths. 
In this case again it is reasonable to drop the higher-order terms in the 
expansion of V(x). 


Problem 3-8 For a harmonic oscillator the lagrangian is

$$L = \frac{m}{2}\dot{x}^2 - \frac{m\omega^2}{2} x^2$$    (3.58)

Show that the resulting kernel is (see Prob. 2-2)

$$K = F(T) \exp\left\{\frac{i m\omega}{2\hbar \sin\omega T}\left[(x_b^2 + x_a^2)\cos\omega T - 2 x_b x_a\right]\right\}$$    (3.59)

where $T = t_b - t_a$. Note that the multiplicative function F(T) has not
been explicitly worked out. It can be obtained by other means, and for
the harmonic oscillator it is (see Sec. 3-11)

$$F(T) = \left(\frac{m\omega}{2\pi i\hbar \sin\omega T}\right)^{1/2}$$    (3.60)
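
As a consistency check on Eqs. (3.59) and (3.60), the oscillator kernel
should reduce to the free-particle kernel of Eq. (3.3) when the
frequency goes to zero. The following sketch is an illustration under
stated assumptions: units with m = hbar = 1 by default, a time interval
short enough that sin(omega T) > 0, and function names invented for the
example.

```python
import cmath
import math


def free_kernel(xb, xa, T, m=1.0, hbar=1.0):
    """Free-particle kernel, Eq. (3.3)."""
    prefactor = cmath.sqrt(m / (2j * math.pi * hbar * T))
    return prefactor * cmath.exp(1j * m * (xb - xa) ** 2 / (2 * hbar * T))


def oscillator_kernel(xb, xa, T, omega, m=1.0, hbar=1.0):
    """Harmonic-oscillator kernel, Eqs. (3.59) and (3.60), for 0 < omega*T < pi."""
    s = math.sin(omega * T)
    c = math.cos(omega * T)
    prefactor = cmath.sqrt(m * omega / (2j * math.pi * hbar * s))
    phase = m * omega / (2 * hbar * s) * ((xb ** 2 + xa ** 2) * c - 2 * xb * xa)
    return prefactor * cmath.exp(1j * phase)


xb, xa, T = 0.3, -0.5, 1.2
gap = abs(oscillator_kernel(xb, xa, T, omega=1e-5) - free_kernel(xb, xa, T))
print(gap)  # shrinks toward zero as omega is made smaller
```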

Problem 3-9 Find the kernel for a particle in a constant external
field f where the lagrangian is

$$L = \frac{m}{2}\dot{x}^2 + f x$$    (3.61)

The result is

$$K = \left(\frac{m}{2\pi i\hbar T}\right)^{1/2} \exp\left\{\frac{i}{\hbar}\left[\frac{m(x_b - x_a)^2}{2T} + \frac{fT(x_b + x_a)}{2} - \frac{f^2 T^3}{24m}\right]\right\}$$    (3.62)

where $T = t_b - t_a$.
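
The exponent in Eq. (3.62) is the classical action divided by hbar, so
it can be checked by integrating the lagrangian of Eq. (3.61) along the
classical path. The sketch below is illustrative only: it assumes the
path starts at t = 0, uses Simpson's rule (which is exact here up to
rounding, since the lagrangian along the path is a quadratic polynomial
in t), and uses function names chosen for the example.

```python
def action_by_quadrature(xa, xb, T, f, m=1.0, n=200):
    """Integrate L = (m/2) v^2 + f x along the classical path (n must be even)."""
    # Initial velocity chosen so that the uniformly accelerated path ends at xb.
    v0 = (xb - xa) / T - f * T / (2 * m)

    def lagrangian(t):
        x = xa + v0 * t + f * t * t / (2 * m)
        v = v0 + f * t / m
        return 0.5 * m * v * v + f * x

    h = T / n
    total = lagrangian(0.0) + lagrangian(T)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * lagrangian(k * h)
    return total * h / 3


def action_closed_form(xa, xb, T, f, m=1.0):
    """The bracket in the exponent of Eq. (3.62)."""
    return m * (xb - xa) ** 2 / (2 * T) + f * T * (xa + xb) / 2 - f ** 2 * T ** 3 / (24 * m)


args = dict(xa=0.2, xb=1.5, T=2.0, f=0.8, m=1.3)
print(action_by_quadrature(**args), action_closed_form(**args))
```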


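Problem 3-10 The lagrangian for a particle of charge e and mass
m in a constant external magnetic field B, in the z direction, is

$$L = \frac{m}{2}(\dot{x}^2 + \dot{y}^2 + \dot{z}^2) + \frac{eB}{2c}(x\dot{y} - y\dot{x})$$    (3.63)

Show that the resulting kernel is

$$K = \left(\frac{m}{2\pi i\hbar T}\right)^{3/2}\left(\frac{\omega T/2}{\sin(\omega T/2)}\right)\exp\left\{\frac{im}{2\hbar}\left[\frac{(z_b - z_a)^2}{T} + \frac{\omega}{2}\cot\frac{\omega T}{2}\left[(x_b - x_a)^2 + (y_b - y_a)^2\right] + \omega(x_a y_b - x_b y_a)\right]\right\}$$    (3.64)

where $T = t_b - t_a$ and $\omega = eB/mc$.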


Problem 3-11 Suppose the harmonic oscillator of Prob. 3-8 is
driven by an external force f(t). The lagrangian is

$$L = \frac{m}{2}\dot{x}^2 - \frac{m\omega^2}{2} x^2 + f(t)x$$    (3.65)

Show that the resulting kernel is (with $T = t_b - t_a$)

$$K = \left(\frac{m\omega}{2\pi i\hbar \sin\omega T}\right)^{1/2} e^{iS_{cl}/\hbar}$$

where

$$\begin{aligned} S_{cl} = \frac{m\omega}{2\sin\omega T}\Big[ &(x_b^2 + x_a^2)\cos\omega T - 2x_b x_a \\ &+ \frac{2x_b}{m\omega}\int_{t_a}^{t_b} f(t)\sin\omega(t - t_a)\,dt + \frac{2x_a}{m\omega}\int_{t_a}^{t_b} f(t)\sin\omega(t_b - t)\,dt \\ &- \frac{2}{m^2\omega^2}\int_{t_a}^{t_b}\int_{t_a}^{t} f(t)f(s)\sin\omega(t_b - t)\sin\omega(s - t_a)\,ds\,dt \Big] \end{aligned}$$    (3.66)

This last result is of great importance in many advanced problems.
It has particular applications in quantum electrodynamics because the
electromagnetic field can be represented as a set of forced harmonic
oscillators.


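Problem 3-12 If the wave function for a harmonic oscillator is (at
t = 0)

$$\psi(x, 0) = \exp\left[-\frac{m\omega}{2\hbar}(x - a)^2\right]$$    (3.67)

then, using Eq. (3.42) and the results of Prob. 3-8, show that

$$\psi(x, T) = \exp\left\{-\frac{i\omega T}{2} - \frac{m\omega}{2\hbar}\left[x^2 - 2ax\,e^{-i\omega T} + a^2\cos(\omega T)\,e^{-i\omega T}\right]\right\}$$    (3.68)

and find the probability distribution $|\psi|^2$.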
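
A quick numerical experiment suggests what the probability distribution
in Prob. 3-12 looks like. The sketch below evaluates Eq. (3.68) directly
and compares its absolute square with a Gaussian whose center follows
the classical motion a cos(omega T). It assumes units with
m = omega = hbar = 1 by default, ignores overall normalization just as
Eq. (3.67) does, and is a check rather than a derivation; the function
names are illustrative.

```python
import cmath
import math


def psi(x, T, a, m=1.0, omega=1.0, hbar=1.0):
    """Unnormalized wave function of Eq. (3.68)."""
    rot = cmath.exp(-1j * omega * T)
    bracket = x * x - 2 * a * x * rot + a * a * math.cos(omega * T) * rot
    return cmath.exp(-0.5j * omega * T - m * omega / (2 * hbar) * bracket)


def moving_gaussian(x, T, a, m=1.0, omega=1.0, hbar=1.0):
    """Gaussian of fixed width centered on the classical position a*cos(omega*T)."""
    center = a * math.cos(omega * T)
    return math.exp(-m * omega / hbar * (x - center) ** 2)


for x, T in [(0.3, 0.0), (0.3, 0.9), (-1.1, 2.5)]:
    print(abs(psi(x, T, a=1.5)) ** 2, moving_gaussian(x, T, a=1.5))
```

The two columns agree, which indicates that the packet oscillates
without changing shape.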


SYSTEMS WITH MANY VARIABLES¹


Suppose a system has several degrees of freedom. A kernel for such a
system can be represented by the form of Eq. (2.25), where the symbol
x(t) now represents several coordinates rather than just one.

We take as a first example a particle moving in three dimensions.
The path is defined by giving three functions x(t), y(t), and z(t). The
action for a free particle, for example, is

$$S = \int_{t_a}^{t_b} \frac{m}{2}\left[\dot{x}(t)^2 + \dot{y}(t)^2 + \dot{z}(t)^2\right] dt$$

The kernel to go from some initial point $(x_a, y_a, z_a)$ at time $t_a$
to a final point $(x_b, y_b, z_b)$ at time $t_b$ is

$$K(x_b, y_b, z_b, t_b; x_a, y_a, z_a, t_a) = \int_a^b \exp\left\{\frac{i}{\hbar}\int_{t_a}^{t_b} \frac{m}{2}\left[\dot{x}(t)^2 + \dot{y}(t)^2 + \dot{z}(t)^2\right] dt\right\} \mathcal{D}x(t)\,\mathcal{D}y(t)\,\mathcal{D}z(t)$$    (3.69)
a La 
The differential is written as $\mathcal{D}x(t)\,\mathcal{D}y(t)\,\mathcal{D}z(t)$.
If the time is divided into intervals $\epsilon$, the position at the
time $t_i$ is given by three variables $x_i$, $y_i$, $z_i$ and the
integral over all variables is $dx_i$, $dy_i$, $dz_i$ for each i in an
expression like Eq. (2.22). (More generally, if we represent the
position by a vector x in some s-dimensional space, the differential at
each time interval is the volume element $d^s x_i$ and the corresponding
path differential is $\mathcal{D}^s x$.)

¹R.P. Feynman, Space-Time Approach to Non-relativistic Quantum Mechanics,
Rev. Mod. Phys., vol. 20, p. 371, 1948.

If the definition of Eq. (2.22) is used, then the normalizing constant
A (Eq. 2.21) must be included for each variable in each time interval.
Thus if the total time interval is broken up into N steps of length
$\epsilon$, the factor $A^{-3N}$ must be included in the integral.

Another situation involving several variables is that of two
interacting systems. Suppose one system consists of a particle of mass m
and coordinate x while the other system is a particle of mass M and
coordinate X. Suppose these two systems interact through a potential
V(x, X). The resulting action is

$$S[x(t), X(t)] = \int_{t_a}^{t_b}\left[\frac{m}{2}\dot{x}^2 + \frac{M}{2}\dot{X}^2 - V(x, X)\right] dt$$    (3.70)

so that the kernel is

$$K(x_b, X_b, t_b; x_a, X_a, t_a) = \int_a^b\int_a^b \exp\left\{\frac{i}{\hbar}S[x(t), X(t)]\right\} \mathcal{D}x(t)\,\mathcal{D}X(t)$$    (3.71)

One might understand this generalization of Eq. (2.25) mathematically.
Thus one might consider the motion of a point in some abstract
two-dimensional space of coordinates x, X. However, it is much easier to
think of it physically as representing the motion of two separate particles
whose coordinates are respectively x and X. Then K is the amplitude
that the particle of mass m goes from the point in space-time
$(x_a, t_a)$ to $(x_b, t_b)$ and the particle of mass M goes from
$(X_a, t_a)$ to $(X_b, t_b)$. The kernel is then the sum of an amplitude
taken over all possible paths of both particles between their respective
start and end points. The amplitude for any particular pair of paths
(i.e., both x(t) and X(t) are specified) is $e^{iS/\hbar}$, where S is
the action defined in Eq. (3.70). Mathematically, the amplitude is a
function of two independent functions x(t) and X(t), and the integral is
over both of the variable functions.


SEPARABLE SYSTEMS 


Suppose we have a situation in which two particles are present, both
moving in one or perhaps more dimensions. Let the vector x represent
the coordinates of one particle and the vector X represent the
coordinates of the other, as in the paragraph above, except that now we
extend the picture to a three-dimensional space. It may happen that
the resulting action can be separated into two parts, as

$$S[\mathbf{x}, \mathbf{X}] = S_x[\mathbf{x}] + S_X[\mathbf{X}]$$    (3.72)

where $S_x$ involves only the paths x(t) and $S_X$ involves only the
paths X(t). This is the situation when the two particles do not interact.

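In this case the kernel becomes the product of one factor depending
on x alone and another depending on X alone. Thus

$$\begin{aligned} K(\mathbf{x}_b, \mathbf{X}_b, t_b; \mathbf{x}_a, \mathbf{X}_a, t_a) &= \int_a^b\int_a^b \exp\left\{\frac{i}{\hbar}\left(S_x[\mathbf{x}] + S_X[\mathbf{X}]\right)\right\} \mathcal{D}^3\mathbf{x}(t)\,\mathcal{D}^3\mathbf{X}(t) \\ &= \int_a^b \exp\left\{\frac{i}{\hbar}S_x[\mathbf{x}]\right\}\mathcal{D}^3\mathbf{x}(t) \int_a^b \exp\left\{\frac{i}{\hbar}S_X[\mathbf{X}]\right\}\mathcal{D}^3\mathbf{X}(t) \\ &= K_x(\mathbf{x}_b, t_b; \mathbf{x}_a, t_a)\, K_X(\mathbf{X}_b, t_b; \mathbf{X}_a, t_a) \end{aligned}$$    (3.73)

Here $K_x$ is the amplitude computed as if only the particle of coordinates
x were present, and $K_X$ is defined similarly. Thus in a situation involving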
two independent noninteracting systems, the kernel for an event involv- 
ing both systems is the product of two independent kernels. These are 
the kernels for each particle to carry out its individual portion of the 
overall event. 

The wave function in a situation involving several particles is defined
in a straightforward manner by analogy with the corresponding kernel
as $\psi(\mathbf{x}, \mathbf{X}, \ldots, t)$. It is interpreted as the
amplitude that, at time t, one particle is at the point x, another
particle is at the point X, etc. The absolute square of the wave
function is the probability that one particle is at point x per unit
volume, another particle is at the point X per unit volume, etc.
Equation (3.42), which holds for the one-dimensional case, can be
immediately extended to read

$$\psi(\mathbf{x}, \mathbf{X}, \ldots, t) = \int\!\!\int\cdots K(\mathbf{x}, \mathbf{X}, \ldots, t; \mathbf{x}', \mathbf{X}', \ldots, t')\,\psi(\mathbf{x}', \mathbf{X}', \ldots, t')\, d^3\mathbf{x}'\, d^3\mathbf{X}' \cdots$$    (3.74)

where $d^3\mathbf{x}'$ is the product of as many differentials as there
are coordinates in x' space.

In case two independent particles are represented by the sets of
coordinates x and X, then the kernel K is the product of one function of x
and t and another of X and t, as mentioned above. However, this does
not imply that, in general, $\psi$ is such a product. In the special
case that $\psi$ is at some particular time a product of a function of x
and another of X (thus $\psi = f(\mathbf{x})g(\mathbf{X})$), then it
will remain so for all time. Each factor will change as it would for the
partial system alone, since the kernel K represents the independent
motion of two particles. But this is a special case. Just because the
particles are independent now does not mean that they always were. There
may have been some interaction in the past, which would imply that
$\psi$ is not a simple product.

Even though the action S does not appear as a simple product in the 
original coordinate system, there is often a transformation (such as that 
of center-of-gravity and internal coordinates) which will make it separa- 
ble. Since the same form for the action is used in quantum mechanics as 
in classical physics, any transformation which will separate a classical 
system will also separate the corresponding quantum-mechanical sys- 
tem. Thus a part of the great body of work in classical physics can be 
applied directly to quantum mechanics. Such transformations are very 
important. It is hard to deal with a system consisting of several vari- 
ables. Separation of variables permits us to reduce a complex problem 
to a number of simpler problems. 


THE PATH INTEGRAL AS A FUNCTIONAL 


When a problem contains more than one variable and a separation is 
not possible, the analysis is generally very difficult. Later on we shall 
discuss some approximations which can be applied to this case. Here we 
shall describe one very powerful tool which can sometimes be applied. 
Consider the kernel given by Eq. (3.71). This can be written out in 
detail as 


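$$K(b,a) = \int_a^b\int_a^b \exp\left\{\frac{i}{\hbar}\left[\int_{t_a}^{t_b}\frac{m}{2}\dot{x}^2\,dt + \int_{t_a}^{t_b}\frac{M}{2}\dot{X}^2\,dt - \int_{t_a}^{t_b}V(x, X, t)\,dt\right]\right\} \mathcal{D}x(t)\,\mathcal{D}X(t)$$    (3.75)

First, suppose we carry out the integral over the paths X(t). The
result can be written formally as

$$K(b,a) = \int_a^b \exp\left\{\frac{i}{\hbar}\int_{t_a}^{t_b}\frac{m}{2}\dot{x}^2\,dt\right\} T[x(t)]\,\mathcal{D}x(t)$$    (3.76)

where

$$T[x(t)] = \int_a^b \exp\left\{\frac{i}{\hbar}\int_{t_a}^{t_b}\left[\frac{M}{2}\dot{X}^2 - V(x, X, t)\right] dt\right\} \mathcal{D}X(t)$$    (3.77)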


These results are interpreted in the following manner. Integrating
over all paths available to the X particle produces a functional T. A
functional is a number whose value depends on specifying a complete
function. For example, the area under a curve is a functional of the
curve: $A = \int f(y)\,dy$. To find it, a function (the curve) must be
specified. We write a functional as A[f(y)] to indicate that A depends
on the function f(y). We do not write A(f(y)), for that might be
interpreted as a function of a function, i.e., that A just depends on
what value f takes at some specified point y. This is not the case.
A[f(y)] depends on the entire shape of the function f(y). It does not
depend on y in any way.
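
In programming terms a functional is a routine that takes a whole
function as its argument and returns a number. The following minimal
sketch of the area functional uses the trapezoidal rule; the name
`area`, the integration limits, and the number of steps are
illustrative choices and not part of the original text.

```python
import math


def area(f, lo, hi, n=1000):
    """Approximate A[f] = integral of f(y) dy from lo to hi (trapezoidal rule)."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h


# The argument is the function itself, not its value at some point y.
print(area(math.sin, 0.0, math.pi))        # close to 2
print(area(lambda y: y * y, 0.0, 1.0))     # close to 1/3
```

The result depends on the entire shape of `f` between the limits, and
the dummy variable y never appears in the answer.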

The functional defined in Eq. (3.77) is the amplitude that the X
particle alone goes between its end points $X_a$ and $X_b$ under the
influence of a potential V. This potential, which depends upon both x
and X, is computed assuming x is held to be a fixed path as X changes.
Thus it is the potential for the X particle when the x particle is moving
along a specific trajectory. Clearly, this amplitude T depends upon the
trajectory chosen for x(t), so we write it as a functional of x(t). Then
the total amplitude is obtained by summing over all paths a functional
consisting of the product of T and the free-particle kernel for x(t).

Thus the amplitude K, like all others, is a sum over the amplitudes 
of all possible alternatives. Each of these amplitudes is a product of 
two lesser amplitudes. The first of these is the amplitude T that the X 
particle goes between its given end points when x has a specified
trajectory. The second is the amplitude that x has that specified trajectory.
The final sum over alternatives becomes the sum over all possible tra- 
jectories of x. It is important to understand this concept clearly, for it 
includes one of the fundamental principles of quantum electrodynamics, 
a subject which will be taken up in a later chapter. 

Of course it is not practical to use this method unless the integral 
T can actually be worked out, either exactly or approximately, for the 
possible values of the trajectory x(t). As we have seen (in Prob. 3-11) 
one exact case is that in which X is a harmonic oscillator. This is a very
important practical case. For example, when a particle interacts with a 
quantized field, the field is an oscillator. 


INTERACTION OF A PARTICLE AND 
A HARMONIC OSCILLATOR 


We shall consider in more detail the interaction of a particle and a
harmonic oscillator. Let the coordinate of the particle be x and that of
the oscillator be X. The action can be written as

$$S[x, X] = S_0[x] + \int_{t_a}^{t_b} g(x(t), t)\,X(t)\,dt + \int_{t_a}^{t_b}\frac{M}{2}\left(\dot{X}^2 - \omega^2 X^2\right) dt$$    (3.78)

where $S_0$ is the action of the particle in the absence of the oscillator. In
the discussion above (Sec. 3-9) we assumed that this action corresponded
to that for a free particle. This assumption is not necessary. The motion
of x could be complicated by the existence of a potential depending upon
x and t only. Thus, for example, the action $S_0$ might be

$$S_0[x] = \int_{t_a}^{t_b}\left[\frac{m}{2}\dot{x}^2 - V(x, t)\right] dt$$    (3.79)


The second term in Eq. (3.78) represents the interaction between the
particle and the oscillator. Note that this term is linear in X. Omission
of a dependence upon $\dot{X}$ does not imply any loss in generality,
since if such a term were to occur, it could be removed by an integration
by parts. We can call the coefficient g the coupling coefficient. Its
dependence upon x(t) is indicated, but it could also depend upon other
variables, such as $\dot{x}(t)$. Since the analysis we are presenting is
general, it is not important to write down the exact form of g. The last
term in Eq. (3.78) is, of course, the action of the oscillator alone. By
combining this with the second term, the functional T of Eq. (3.77) can
be written as

$$T[x(t)] = \int_a^b \exp\left\{\frac{i}{\hbar}\int_{t_a}^{t_b}\left[g(x(t), t)\,X(t) + \frac{M}{2}\left(\dot{X}^2 - \omega^2 X^2\right)\right] dt\right\} \mathcal{D}X(t)$$    (3.80)

Now as far as X is concerned, the situation is just that of a forced
harmonic oscillator. The forcing function g(x(t), t) is some special
function of t, say, f(t). Thus the path integral is the same as that
considered in Prob. 3-11, with f(t) replaced by g(x(t), t) and the final
and initial coordinate values $(x_b, x_a)$ replaced by $(X_b, X_a)$.

For illustrative purposes, to simplify the expressions somewhat, we
take the special case in which the oscillator initially and finally is at the
origin, so $X_b = X_a = 0$. (The general case is just as easily handled.)
Then according to Prob. 3-11 in this case we have


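$$T = \left(\frac{M\omega}{2\pi i\hbar\sin\omega T}\right)^{1/2} \exp\left\{-\frac{i}{\hbar M\omega\sin\omega T}\int_{t_a}^{t_b}\int_{t_a}^{t} g(x(t), t)\,g(x(s), s)\,\sin\omega(t_b - t)\,\sin\omega(s - t_a)\,ds\,dt\right\}$$    (3.81)

Therefore, the kernel for the present situation can be written

$$K(b,a) = \left(\frac{M\omega}{2\pi i\hbar\sin\omega T}\right)^{1/2}\int_a^b \exp\left\{\frac{i}{\hbar}S_0[x] - \frac{i}{\hbar M\omega\sin\omega T}\int_{t_a}^{t_b}\int_{t_a}^{t} g(x(t), t)\,g(x(s), s)\,\sin\omega(t_b - t)\,\sin\omega(s - t_a)\,ds\,dt\right\}\mathcal{D}x(t)$$    (3.82)

with a similar (but more complicated) expression for arbitrary $X_a$, $X_b$.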

This is a more complicated path integral than any we have had to 
solve so far. It is not possible to proceed further with the solution until 
various methods of approximation have been developed in succeeding 
chapters. Note that the integrand of this path integral can still be
thought of as being of the form $e^{iS/\hbar}$, but now S is no longer a
function of only $\dot{x}$, x, and t. Instead, S contains a product of
variables defined at two different times, s and t. The separation of
past and future can no longer be made. This happens because the variable
x at some previous time affects the oscillator which, at some later
time, reacts back to affect x. No wave function $\psi(x, t)$ can be
defined to give the amplitude that
the particle is at some particular place x at a particular time t. Such 
an amplitude would be insufficient for continuing calculations into the 
future, since at any time one must also know what the oscillator is doing. 

Consider the path integral for the harmonic oscillator problem (Prob.
3-8). This is

$$K(b,a) = \int_a^b \exp\left\{\frac{i}{\hbar}\int_0^T \frac{m}{2}\left(\dot{x}^2 - \omega^2 x^2\right) dt\right\}\mathcal{D}x(t)$$    (3.83)

Using the methods of Sec. 3-5 this path integral can be reduced to a
product of two functions, as in Prob. 3-8. The more important of these
two functions depends upon the classical orbit for a harmonic oscillator
and is given in Eq. (3.59). The remaining function depends upon the
time interval only and is written down in Eq. (3.60). This latter function
can be written as

$$F(T) = \int_0^0 \exp\left\{\frac{i}{\hbar}\int_0^T \frac{m}{2}\left(\dot{y}^2 - \omega^2 y^2\right) dt\right\}\mathcal{D}y(t)$$    (3.84)

We shall solve this, at least to within a factor independent of
$\omega$, by a method which illustrates still another way of handling
path integrals. Since all paths y(t) go from 0 at t = 0 to 0 at t = T,
such paths can be written as a Fourier sine series with a fundamental
period of T. Thus

$$y(t) = \sum_n a_n \sin\frac{n\pi t}{T}$$    (3.85)

It is possible then to specify a path through the coefficients $a_n$
instead of through the function values y at any particular value of t.
This is a linear transformation whose jacobian J is a dimensionless
constant, obviously independent of $\omega$, m, and $\hbar$.

Of course, it is possible to evaluate this jacobian directly. However, 
here we shall avoid the evaluation of J by collecting all factors which 
are independent of $\omega$ (including J) into a single constant factor.
We can always recover the correct factor at the end, since we know the
value for $\omega = 0$, namely $F(T) = (m/2\pi i\hbar T)^{1/2}$ (a free
particle).

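The integral for the action can be written in terms of the Fourier
series of Eq. (3.85). Thus the kinetic-energy term becomes

$$\frac{m}{2}\int_0^T \dot{y}^2\,dt = \frac{m}{2}\sum_n\sum_k \frac{n\pi}{T}\frac{k\pi}{T}\,a_n a_k \int_0^T \cos\frac{n\pi t}{T}\cos\frac{k\pi t}{T}\,dt = \frac{m}{2}\,\frac{T}{2}\sum_n \left(\frac{n\pi}{T}\right)^2 a_n^2$$    (3.86)

and similarly the potential-energy term is

$$\frac{m\omega^2}{2}\int_0^T y^2\,dt = \frac{m\omega^2}{2}\,\frac{T}{2}\sum_n a_n^2$$    (3.87)

On the assumption that the time T is divided into discrete steps of
length $\epsilon$ (as in Eq. 2.19) so that there are only a finite
number N of coefficients $a_n$, the path integral becomes

$$F(T) = J\,\frac{1}{A}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} \exp\left\{\sum_{n=1}^{N}\frac{im}{2\hbar}\,\frac{T}{2}\left[\left(\frac{n\pi}{T}\right)^2 - \omega^2\right] a_n^2\right\}\frac{da_1}{A}\,\frac{da_2}{A}\cdots\frac{da_N}{A}$$    (3.88)

Since the exponent can be separated into factors, the integral over each
coefficient $a_n$ can be done separately. The result of one such
integration is

$$\int_{-\infty}^{\infty}\exp\left\{\frac{im}{2\hbar}\,\frac{T}{2}\left(\frac{n^2\pi^2}{T^2} - \omega^2\right) a_n^2\right\}\frac{da_n}{A} = \left(\frac{n^2\pi^2}{T^2} - \omega^2\right)^{-1/2}\sqrt{\frac{2}{\epsilon T}}$$    (3.89)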








3-11 Evaluation of path integrals by Fourier series 73 
Thus the path integral is proportional to

$$\prod_{n=1}^{N}\left(\frac{n^2\pi^2}{T^2} - \omega^2\right)^{-1/2} = \prod_{n=1}^{N}\left(\frac{n^2\pi^2}{T^2}\right)^{-1/2}\prod_{n=1}^{N}\left(1 - \frac{\omega^2 T^2}{n^2\pi^2}\right)^{-1/2}$$    (3.90)

The first product does not depend on $\omega$ and combines with the
jacobian and other factors we have collected into a single constant. The
second factor has the limit $[(\sin\omega T)/\omega T]^{-1/2}$ as
$N \to \infty$, that is, as $\epsilon \to 0$. Thus

$$F(T) = C\left(\frac{\sin\omega T}{\omega T}\right)^{-1/2}$$    (3.91)

where C is independent of $\omega$. But for $\omega = 0$ our integral is
that for a free particle, for which we have already found that

$$F(T) = \left(\frac{m}{2\pi i\hbar T}\right)^{1/2}$$    (3.92)

Hence for the harmonic oscillator we have

$$F(T) = \left(\frac{m\omega}{2\pi i\hbar\sin\omega T}\right)^{1/2}$$    (3.93)

which is to be substituted in Eq. (3.59) to obtain the complete solution.
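
The limit used to pass from Eq. (3.90) to Eq. (3.91) is the classical
infinite product for the sine. It can be watched converging numerically.
The sketch below is an illustration only; the sample values of omega
and T and the function name are arbitrary choices made for the example.

```python
import math


def truncated_product(omega, T, N):
    """Product over n = 1..N of (1 - omega^2 T^2 / (n^2 pi^2))."""
    p = 1.0
    for n in range(1, N + 1):
        p *= 1.0 - (omega * T / (n * math.pi)) ** 2
    return p


omega, T = 1.3, 0.9
limit = math.sin(omega * T) / (omega * T)
for N in (10, 1000, 100000):
    print(N, truncated_product(omega, T, N), limit)
```

The truncation error falls off roughly like 1/N, so the convergence is
slow but steady; taking the power -1/2 of the product then gives the
omega-dependent factor of F(T).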


Problem 3-13 By keeping track of all the constants, show that the 
jacobian satisfies 


(N+1)/2 N 
T 2T 1 

The Schrödinger Description
of Quantum Mechanics


THE path integrals which we have discussed so far (with the exception
of Eq. 3.82) have integrands which are exponentials of actions with the
property

$$S[b, a] = S[b, c] + S[c, a]$$    (4.1)


Such path integrals can be analyzed in terms of the properties of integral 
equations which can be deduced from them. We have already seen this 
in Chap. 2 (e.g., Eq. 2.31) and Chap. 3 (e.g., Eq. 3.42). 

A still more convenient method is to reduce the path integrals to 
differential equations if possible. This possibility exists in quantum me- 
chanics and is, in fact, the most convenient way to present that theory. 
It is in almost every case easier to solve the differential equation than 
it is to evaluate the path integral directly. The conventional presenta- 
tion of quantum mechanics is based on this differential equation, called 
the Schrodinger equation. Here we shall derive this equation from our 
formulation. We shall not solve this equation for a large number of exam- 
ples, because such solutions are presented in a detailed and satisfactory 
fashion in other books on quantum mechanics.¹

In this chapter our purpose is twofold: (1) For the reader primarily 
interested in quantum mechanics our aim is to connect the path integral 
formulation with other formulations which are found in the standard 
literature and textbooks so that he can continue his study in those books 
and can learn to translate back and forth between the two different 
languages. (2) For the reader primarily interested in path integrals this 
chapter will show a technique which is available for a certain class of 
path integrals to reduce these path integrals to differential equations. 
This technique is best shown by the particular example of quantum 
mechanics which we shall develop here. 


THE SCHRODINGER EQUATION 


The Differential Equation Form. The reason that we can develop 
a differential equation is that the relationship of Eq. (4.1) is correct for 
any values of the points a, b, and c. For example, the time $t_b$ can be
only an infinitesimal time $\epsilon$ greater than the time $t_c$. This will permit
us to relate the value of a path integral at one time to its value a short 
time later. In this manner we can obtain a differential equation for the 
path integral. 


¹For example, see L.I. Schiff, “Quantum Mechanics,” 2nd ed., McGraw-Hill Book
Company, New York, 1955. 

We have already found that as a consequence of Eq. (4.1) we can
define a wave function. Furthermore, we know that the equation

$$\psi(x_b, t_b) = \int_{-\infty}^{\infty} K(x_b, t_b; x_a, t_a)\,\psi(x_a, t_a)\,dx_a$$    (4.2)

gives the wave function at a time $t_b$ in terms of the wave function at
a time $t_a$. In order to obtain the differential equation that we seek,
we apply this relationship in the special case that the time $t_b$
differs only by an infinitesimal interval $\epsilon$ from $t_a$. The
kernel K(b, a) is proportional to the exponential of $i/\hbar$ times the
action for the interval $t_a$ to $t_b$. For a short interval $\epsilon$
the action is approximately $\epsilon$ times the lagrangian for this
interval. That is, using the same approximation as that of Eq. (2.34),
we have

$$\psi(x, t + \epsilon) = \frac{1}{A}\int_{-\infty}^{\infty} \exp\left[\frac{i}{\hbar}\,\epsilon\,L\left(\frac{x - y}{\epsilon}, \frac{x + y}{2}\right)\right]\psi(y, t)\,dy$$    (4.3)

We shall now apply this to the special case of a particle moving in
one dimension subject to a potential energy V(x, t), i.e., that for which
$L = (m/2)\dot{x}^2 - V(x, t)$. In this case Eq. (4.3) becomes

$$\psi(x, t + \epsilon) = \frac{1}{A}\int_{-\infty}^{\infty} \exp\left[\frac{i}{\hbar}\,\frac{m(x - y)^2}{2\epsilon}\right] \exp\left[-\frac{i}{\hbar}\,\epsilon\,V\left(\frac{x + y}{2}, t\right)\right]\psi(y, t)\,dy$$    (4.4)

The quantity $(x - y)^2/\epsilon$ appears in the exponent of the first
factor. It is clear that if y is appreciably different from x, this
quantity is very large and the exponential consequently oscillates very
rapidly as y varies. When this factor oscillates rapidly, the integral
over y gives a very small value (because of the smooth behavior of the
other factors). Only if y is near x (where the exponential changes more
slowly) do we get important contributions. For this reason we make the
substitution $y = x + \eta$ with the expectation that appreciable
contributions to the integral will occur only for small $\eta$. We obtain

$$\psi(x, t + \epsilon) = \frac{1}{A}\int_{-\infty}^{\infty} \exp\left(\frac{im\eta^2}{2\hbar\epsilon}\right) \exp\left[-\frac{i}{\hbar}\,\epsilon\,V\left(x + \frac{\eta}{2}, t\right)\right]\psi(x + \eta, t)\,d\eta$$    (4.5)

The phase of the first exponential changes from 0 to 1 radian when
$\eta$ changes from 0 to $\sqrt{2\hbar\epsilon/m}$, so most of the
integral is contributed by values of $\eta$ in this order.
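
We may expand $\psi$ in a power series. We need only keep terms of order
$\epsilon$. This implies keeping second-order terms in $\eta$. The term
$\epsilon V(x + \eta/2, t)$ may be replaced by $\epsilon V(x, t)$ because
the error is of higher order than $\epsilon$. Expanding the left-hand
side to first order in $\epsilon$ and the right-hand side to first order
in $\epsilon$ and second order in $\eta$, we obtain

$$\psi(x, t) + \epsilon\frac{\partial\psi}{\partial t} = \frac{1}{A}\int_{-\infty}^{\infty} \exp\left(\frac{im\eta^2}{2\hbar\epsilon}\right)\left[1 - \frac{i}{\hbar}\,\epsilon\,V(x, t)\right]\left[\psi(x, t) + \eta\frac{\partial\psi}{\partial x} + \frac{\eta^2}{2}\frac{\partial^2\psi}{\partial x^2}\right] d\eta$$    (4.6)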

If we take the leading term on the right-hand side, we have the quantity
$\psi(x, t)$ multiplied by the integral

$$\frac{1}{A}\int_{-\infty}^{\infty} \exp\left(\frac{im\eta^2}{2\hbar\epsilon}\right) d\eta = \frac{1}{A}\left(\frac{2\pi i\hbar\epsilon}{m}\right)^{1/2}$$    (4.7)

On the left-hand side we have just $\psi(x, t)$. In order that both
sides agree in the limit $\epsilon$ approaches 0, it is necessary that A
be so chosen that the expression of Eq. (4.7) equals 1. That is,

$$A = \left(\frac{2\pi i\hbar\epsilon}{m}\right)^{1/2}$$    (4.8)

as we have stated previously (see Eq. 2.21). This is a way of obtaining
the quantity A in more complicated problems also. The A must be so
chosen that the equation is correct to zero order in $\epsilon$.
Otherwise, no limit will exist as $\epsilon$ approaches 0 in the
original path integral.

In order to evaluate the right-hand side of Eq. (4.6), we shall have
to use two integrals

$$\frac{1}{A}\int_{-\infty}^{\infty} \eta\,\exp\left(\frac{im\eta^2}{2\hbar\epsilon}\right) d\eta = 0$$    (4.9)

and

$$\frac{1}{A}\int_{-\infty}^{\infty} \eta^2\exp\left(\frac{im\eta^2}{2\hbar\epsilon}\right) d\eta = \frac{i\hbar\epsilon}{m}$$    (4.10)

Writing out the right-hand side of Eq. (4.6) gives

$$\psi + \epsilon\frac{\partial\psi}{\partial t} = \psi - \frac{i}{\hbar}\,\epsilon V\psi + \frac{i\hbar\epsilon}{2m}\frac{\partial^2\psi}{\partial x^2}$$    (4.11)

This will be true to order $\epsilon$ if $\psi(x, t)$ satisfies the
differential equation

$$\frac{\partial\psi}{\partial t} = -\frac{i}{\hbar}\left[-\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} + V(x, t)\,\psi\right]$$    (4.12)

This is the Schrödinger equation for our problem of a particle moving in
one dimension. Corresponding equations in more complicated situations
can be worked out in the same way, as demonstrated by the following
problems.


Problem 4-1 Show that for a single particle moving in three
dimensions in a potential energy V(x, t) the Schrödinger equation is

$$\frac{\partial\psi(\mathbf{x}, t)}{\partial t} = -\frac{i}{\hbar}\left[-\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{x}, t) + V(\mathbf{x}, t)\,\psi(\mathbf{x}, t)\right]$$    (4.13)

This equation was discovered by Schrödinger in 1925 and formed the
central feature of the development of quantum mechanics thereafter.


The Operator Form. The equations which result from various
problems corresponding to different forms for the lagrangian can all be
written for convenience in the form

$$\frac{\partial\psi}{\partial t} = -\frac{i}{\hbar}H\psi$$    (4.14)

Here H does not represent a number but indicates an operation on
$\psi$. It is called the hamiltonian operator. For example, in
Eq. (4.12) this operation is

$$H = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x, t)$$    (4.15)

Such an equation with operators on both sides means this: If any
function f(x) is written after each operator on each side, the equation
will be true. That is, Eq. (4.15) symbolizes the statement: The relation

$$Hf(x) = -\frac{\hbar^2}{2m}\frac{\partial^2 f(x)}{\partial x^2} + V(x, t)\,f(x)$$    (4.16)

holds for any function f(x).


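Problem 4-2 For a particle of charge e in a magnetic field the
lagrangian is

$$L = \frac{m}{2}|\dot{\mathbf{x}}|^2 + \frac{e}{c}\,\dot{\mathbf{x}}\cdot\mathbf{A}(\mathbf{x}, t) - e\phi(\mathbf{x}, t)$$    (4.17)

where $\dot{\mathbf{x}}$ is the velocity vector, c is the velocity of
light, and A and $\phi$ are the vector and scalar potentials. Show that
the corresponding Schrödinger equation is

$$\frac{\partial\psi}{\partial t} = -\frac{i}{\hbar}\left[\frac{1}{2m}\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\mathbf{A}\right)\cdot\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\mathbf{A}\right)\psi + e\phi\,\psi\right]$$    (4.18)

Thus the hamiltonian is

$$H = \frac{1}{2m}\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\mathbf{A}\right)\cdot\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\mathbf{A}\right) + e\phi$$    (4.19)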

Problem 4-3 Show that the complex conjugate function $\psi^*$, defined
as the function $\psi$ with every i changed to $-i$, satisfies

$$\frac{\partial\psi^*}{\partial t} = +\frac{i}{\hbar}(H\psi)^*$$    (4.20)





The notation for operators can be described by giving a number
of examples. For example, the operator x means multiplication by x,
the operator $x^2$ means multiplication by $x^2$, the operator V(x)
(some function of x) means multiplication by V(x), the operator
$\partial/\partial x$ means partial differentiation with respect to x,
$\partial\psi/\partial x$, etc.

If A and B are operators, then the operator AB means that we first
apply B and then A, that is, $AB\psi$ means $A(B\psi)$. Thus, for
example, the operator $x(\partial/\partial x)$ means x times
$\partial\psi/\partial x$. On the other hand, the operator
$(\partial/\partial x)x$ means the partial derivative with respect to x
of $x\psi$, or

$$\frac{\partial}{\partial x}(x\psi) = x\frac{\partial\psi}{\partial x} + \psi$$

We see that in general the operator AB and the operator BA are not
identical.

We further define the operator A+B by the rule that A+B operating
on $\psi$ is $A\psi + B\psi$. For example, the previous equation can be
written as an equation among operators as follows:

$$\frac{\partial}{\partial x}\,x = x\,\frac{\partial}{\partial x} + 1$$    (4.21)

the meaning being that $(\partial/\partial x)xf = x(\partial/\partial x)f + f$
for any function f.
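
The operator identity can be tried out on a concrete function. In the
sketch below the derivative is approximated by a central difference, so
the agreement is only approximate; the helper names, the step size, and
the choice of test functions are assumptions made for illustration.

```python
import math


def ddx(g, x, h=1e-5):
    """Central-difference approximation to dg/dx at the point x."""
    return (g(x + h) - g(x - h)) / (2 * h)


def commutator_on(f, x):
    """Apply (d/dx) x - x (d/dx) to the function f and evaluate at x."""
    first = ddx(lambda u: u * f(u), x)   # (d/dx) acting on x*f
    second = x * ddx(f, x)               # x times df/dx
    return first - second


x0 = 0.8
print(commutator_on(math.sin, x0), math.sin(x0))
print(commutator_on(math.exp, x0), math.exp(x0))
```

In each case the difference of the two orderings returns the function
itself, which is the content of the operator equation: the two orderings
differ by the unit operator.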


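Problem 4-4 Show

$$\frac{\partial^2}{\partial x^2}\,x = x\,\frac{\partial^2}{\partial x^2} + 2\frac{\partial}{\partial x}$$    (4.22)

and therefore that, for the H of Eq. (4.15),

$$Hx - xH = -\frac{\hbar^2}{m}\frac{\partial}{\partial x}$$    (4.23)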


This operator notation is used a great deal in the conventional for- 
mulations of quantum mechanics. 

The Schrödinger Equation for the Kernel. Since K(b, a), thought
of as a function of the variables b, is a special wave function (namely,
that for a particle which starts at a), we see that K must also satisfy a
Schrödinger equation. Thus for the case specified by Eq. (4.15)

$$\frac{\partial}{\partial t_b}K(b, a) = -\frac{i}{\hbar}\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_b^2}K(b, a) + V(b)\,K(b, a)\right]$$    (4.24)

for $t_b > t_a$. In general we have

$$\frac{\partial}{\partial t_b}K(b, a) = -\frac{i}{\hbar}H_b K(b, a)$$    (4.25)

wherein the operator $H_b$ operates on the b variables only.


Problem 4-5 Using the relation

$$K(b,a) = \int_{-\infty}^{\infty}K(b,c)K(c,a)\,dx_c \tag{4.26}$$

with $t_c - t_a = \epsilon$, an infinitesimal, show that if $t_b$ is greater than $t_a$, the
kernel $K$ satisfies

$$\frac{\partial}{\partial t_a}K(b,a) = +\frac{i}{\hbar}H_a^*K(b,a) \tag{4.27}$$

where $H_a$ now operates on the $a$ variables only.


The function $K(b,a)$ defined by a path integral in Eq. (2.25) is
defined only for $t_b > t_a$. The function is not defined if $t_b < t_a$. It will prove
to be very convenient in later work (e.g., Chap. 6) to define $K(b,a)$ to
be zero for $t_b < t_a$. (With this convention Eq. (4.2), for example, is valid
only if $t_b > t_a$.) With the condition

$$K(b,a) = 0 \qquad \text{for } t_b < t_a \tag{4.28}$$

it is evident that Eq. (4.25) is satisfied also for $t_b < t_a$ (in a trivial
fashion, of course, since $K = 0$). But this equation is not satisfied at
the point $t_b = t_a$, because $K(b,a)$ is discontinuous at $t_b = t_a$.

Problem 4-6 Show that $K(b,a) \to \delta(x_b - x_a)$ as $t_b \to t_a + 0$.

From the result of Prob. 4-6 we see that the derivative of $K$ with
respect to $t_b$ gives a delta function in the time multiplied by the height
of the jump, $\delta(x_b - x_a)$. Hence $K(b,a)$ satisfies

$$\frac{\partial}{\partial t_b}K(b,a) = -\frac{i}{\hbar}H_bK(b,a) + \delta(x_b - x_a)\,\delta(t_b - t_a) \tag{4.29}$$

This equation plus the boundary condition of Eq. (4.28) could serve
to define $K(b,a)$ if one were to have started out from the Schrödinger
equation as the fundamental definition in quantum mechanics. It is
clear that the quantity $K(b,a)$ is a kind of Green's function for the
Schrödinger equation.


The Conservation of Probability. The hamiltonian operator
given by Eq. (4.15) has the interesting property that, if $f$, $g$ are any
functions which fall off to zero at infinity,

$$\int_{-\infty}^{\infty}(Hg)^*f\,dx = \int_{-\infty}^{\infty}g^*(Hf)\,dx \tag{4.30}$$

The meaning of the symbols is this. On the left we are to take $g$, operate
on it with $H$ (forming $Hg$), and then take the complex conjugate. The
result is then multiplied by $f$ and integrated over all space. The result
is the same as taking $Hf$, multiplying by the complex conjugate of $g$,
and integrating. It is easily verified that this is true by integrating the
term $\int (Hg)^*f\,dx$ (by parts, where necessary).

For our example in Eq. (4.15) we have for the left side of Eq. (4.30)

$$-\frac{\hbar^2}{2m}\int_{-\infty}^{\infty}\frac{d^2g^*}{dx^2}f\,dx + \int_{-\infty}^{\infty}Vg^*f\,dx = -\frac{\hbar^2}{2m}\left[\int_{-\infty}^{\infty}g^*\frac{d^2f}{dx^2}\,dx + \frac{dg^*}{dx}f\,\Big|_{-\infty}^{\infty} - g^*\frac{df}{dx}\Big|_{-\infty}^{\infty}\right] + \int_{-\infty}^{\infty}Vg^*f\,dx \tag{4.31}$$

(integrating by parts twice). If $f$, $g$ fall off at infinity, the integrated
parts vanish and Eq. (4.30) is established. An operator which has the
property given by Eq. (4.30) is called hermitian. In all cases of quantum
mechanics the hamiltonian is hermitian. For more general cases than
that considered above the integration over our one-dimensional variable
$x$ becomes an integration (or sum) over all the variables of the system.
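The hermitian property of Eq. (4.30) can be illustrated on a grid. This is a rough numerical sketch, not a proof: the units ($\hbar = m = 1$), the sample potential $V(x) = x^2/2$, the two test functions, and the finite-difference second derivative are all assumptions introduced here.

```python
import cmath

HBAR = 1.0  # units with hbar = m = 1 (a convenience assumed here)
MASS = 1.0

def apply_h(u, xs, dx):
    """Apply H = -(hbar^2/2m) d^2/dx^2 + V(x) on interior grid points."""
    out = [0j] * len(u)
    for i in range(1, len(u) - 1):
        d2 = (u[i + 1] - 2.0 * u[i] + u[i - 1]) / (dx * dx)
        v = 0.5 * xs[i] ** 2  # sample potential V(x) = x^2/2
        out[i] = -(HBAR ** 2) / (2.0 * MASS) * d2 + v * u[i]
    return out

def integrate(values, dx):
    """Simple rectangle-rule integral over the grid."""
    return sum(values) * dx

n, half_width = 801, 8.0
dx = 2.0 * half_width / (n - 1)
xs = [-half_width + i * dx for i in range(n)]

# Two complex test functions that fall off to zero at the grid edges.
f = [cmath.exp(-(x - 0.5) ** 2 + 1j * x) for x in xs]
g = [cmath.exp(-0.5 * x * x - 0.7j * x) for x in xs]

hf, hg = apply_h(f, xs, dx), apply_h(g, xs, dx)
left = integrate([hg[i].conjugate() * f[i] for i in range(n)], dx)
right = integrate([g[i].conjugate() * hf[i] for i in range(n)], dx)
print(left, right)
```

The two printed complex numbers agree to rounding error because the discarded boundary terms involve only the negligible values of `f` and `g` at the grid edges, which mirrors the vanishing of the integrated parts in Eq. (4.31).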
If we put $f$ and $g$ equal to $\psi(x,t)$, we get

$$\int_{-\infty}^{\infty}(H\psi)^*\psi\,dx = \int_{-\infty}^{\infty}\psi^*(H\psi)\,dx \tag{4.32}$$

and if $\psi$ satisfies the wave equation (4.14), this becomes

$$\int_{-\infty}^{\infty}\frac{\partial\psi^*}{\partial t}\psi\,dx + \int_{-\infty}^{\infty}\psi^*\frac{\partial\psi}{\partial t}\,dx = \frac{d}{dt}\left(\int_{-\infty}^{\infty}\psi^*\psi\,dx\right) = 0 \tag{4.33}$$

That is, $\int\psi^*\psi\,dx$ is a constant independent of time. This is easily
interpreted. For if $\psi$ is suitably normalized, $\psi^*\psi$ is the probability
of being found at $x$; so the integral is the probability of being found
somewhere, which is certainty (or 1) and is constant. Of course, as far
as the wave equation is concerned $\psi$ can be multiplied by any constant
and still be a solution. Then $\psi^*\psi$ is multiplied by the square of the
constant, and the integral is this constant squared.
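The step from Eq. (4.32) to Eq. (4.33) can be spelled out in one line. The sketch below assumes that Eq. (4.14) has the form $\partial\psi/\partial t = -(i/\hbar)H\psi$, which is consistent with Eqs. (4.20) and (4.25).

```latex
% Assumes Eq. (4.14) reads  \partial\psi/\partial t = -(i/\hbar) H\psi.
\begin{align*}
H\psi &= i\hbar\,\frac{\partial\psi}{\partial t},
\qquad
(H\psi)^* = -i\hbar\,\frac{\partial\psi^*}{\partial t} \\
\int (H\psi)^*\psi\,dx - \int \psi^*(H\psi)\,dx
&= -i\hbar\left[\int \frac{\partial\psi^*}{\partial t}\,\psi\,dx
   + \int \psi^*\,\frac{\partial\psi}{\partial t}\,dx\right] = 0
\end{align*}
```

The left side vanishes by Eq. (4.32), so the bracket vanishes, and the bracket is the time derivative of $\int\psi^*\psi\,dx$.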

It is fundamental to our definition of $\psi$ as probability amplitude that
the integral of $\psi^*\psi$ is constant. In terms of the kernel this means that if
$f$ is the wave function at time $t_a$, then at time $t_b$ it has the same square
integral. That is, if

$$\psi(b) = \int_{-\infty}^{\infty}K(b,a)f(a)\,dx_a \tag{4.34}$$

then

$$\int_{-\infty}^{\infty}\psi^*(b)\psi(b)\,dx_b = \int_{-\infty}^{\infty}f^*(a)f(a)\,dx_a \tag{4.35}$$

or

$$\iiint K^*(b;x_a',t_a)K(b;x_a,t_a)f^*(x_a')f(x_a)\,dx_a\,dx_a'\,dx_b = \int f^*(x_a)f(x_a)\,dx_a \tag{4.36}$$

For this to be true for arbitrary $f$ we must have

$$\int_{-\infty}^{\infty}K^*(b;x_a',t_a)K(b;x_a,t_a)\,dx_b = \delta(x_a' - x_a) \tag{4.37}$$

That is, in order to interpret $\psi$ as a probability amplitude, the
kernel must satisfy Eq. (4.37). We have derived this by means of the
Schrödinger equation. It would be nicer to demonstrate this and other
properties, such as Eq. (4.38) and Prob. 4-7, directly in terms of the
path integral definition of $K$ instead of coming through the differential
equation. It is possible, of course, but it is not so simple or neat as
a derivation of such a fundamental relation should be. One can verify
Eq. (4.37) as follows: For a small interval with $t_b = t_a + \epsilon$, Eq. (4.37)
follows directly from the expression $e^{i\epsilon L/\hbar}$ for this interval. By
induction, the complete Eq. (4.37) results. One disadvantage of the approach
to quantum mechanics through path integrals is the fact that relations
involving $\psi^*$ or $K^*$ are not self-evident.

By changing the variable name in Eq. (4.37) from $a$ to $c$, multiplying
by $K(c,a)$, and integrating over $x_c$, we find

$$\int_{-\infty}^{\infty}K^*(b,c)K(b,a)\,dx_b = K(c,a) \tag{4.38}$$


where, as usual, $t_b > t_c > t_a$. Compare this to $\int K(b,c)K(c,a)\,dx_c =
K(b,a)$. We may describe the second relation this way: Starting at $t_a$,
$K(c,a)$ gives us the amplitude at the later time $t_c$. If we wish to go to
a still later time $t_b$, we can do so by using the kernel $K(b,c)$. On the
other hand, if having the amplitude at $t_b$ we want to work back to find it
at an earlier time $t_c < t_b$, we can do this by using the function $K^*(b,c)$,
according to Eq. (4.38). That is, $K^*(b,c)$ undoes the work of $K(b,c)$.


Problem 4-7 Show that $\int K^*(b,a)K(b,c)\,dx_b = K^*(c,a)$, where
our usual convention of $t_b > t_c > t_a$ holds.


4-2 THE TIME-INDEPENDENT HAMILTONIAN


Steady States of Definite Energy. The special case that the
hamiltonian $H$ is independent of time is of great practical importance.
This corresponds to the case that the action $S$ does not depend on the
time explicitly; e.g., the potentials $\mathbf{A}$ and $\phi$, and the potential energy
$V$, do not contain $t$. In this case the kernel cannot depend upon the
absolute time but instead is a function only of the interval $t_b - t_a$. As a
consequence, there exist wave functions that depend periodically on the
time.

It is easiest to see what happens by studying the differential equation.
Starting from the Schrödinger equation (4.14), we try a special solution
of the form $\psi(x,t) = \phi(x)f(t)$, a function of position only multiplied by
a function of time only. Substitution gives us the relation

$$\phi(x)f'(t) = -\frac{i}{\hbar}f(t)H\phi(x) \tag{4.39}$$

or

$$\frac{f'(t)}{f(t)} = -\frac{i}{\hbar}\frac{1}{\phi(x)}H\phi(x) \tag{4.40}$$

The left-hand side of this equation does not depend upon $x$, whereas the
right-hand side does not depend upon $t$. Because they are always equal
neither side can depend upon either variable $t$ or $x$. That is, each side is a
constant. Let us call the constant $-(i/\hbar)E$. Then $f'(t) = -(i/\hbar)Ef(t)$,
or $f(t) = f_0e^{-(i/\hbar)Et}$, where $f_0$ is an arbitrary constant factor. Thus the
special solution is of the form

$$\psi(x,t) = \phi(x)e^{-(i/\hbar)Et} \tag{4.41}$$

where $\phi(x)$ satisfies

$$H\phi(x) = E\phi(x) \tag{4.42}$$

That is, for this special solution the wave function oscillates with
the same definite frequency at every point in space. We saw (Eq. 3.17)
that the frequency with which a wave function oscillates corresponds,
in classical physics, to the energy. Therefore, we say that when the
wave function is of this special form, the state has a definite energy $E$.
For each value of $E$ a different particular function $\phi(x)$, a solution of
Eq. (4.42), must be sought.

The probability that a particle is at $x$ is the absolute square of the
wave function $\psi(x)$, or $|\psi(x)|^2$. In view of Eq. (4.41) this is equal to
$|\phi(x)|^2$ and does not depend upon the time. That is, the probability of
finding the particle at any location is independent of the time. We say
under these circumstances that the system is in a stationary state,
stationary in the sense that there is no variation in the probabilities as
a function of time.
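A few lines of code make the time independence of $|\psi|^2$ concrete. In this sketch the spatial factors `phi1` and `phi2`, the energies, and the units ($\hbar = 1$) are hypothetical choices made only for illustration; they are not solutions of any particular Eq. (4.42). For contrast, the sketch also forms a sum of two factorized functions with different $E$, whose absolute square does change with time.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (assumed for this sketch)

def phi1(x):
    return math.exp(-x * x)          # hypothetical spatial factor

def phi2(x):
    return x * math.exp(-x * x)      # a second hypothetical spatial factor

def psi_stationary(x, t, energy=2.0):
    """Special solution of the form (4.41): phi(x) * exp(-i E t / hbar)."""
    return phi1(x) * cmath.exp(-1j * energy * t / HBAR)

def psi_mixed(x, t, e1=2.0, e2=5.0):
    """Equal-weight sum of two functions of the form (4.41) with different E."""
    return (phi1(x) * cmath.exp(-1j * e1 * t / HBAR)
            + phi2(x) * cmath.exp(-1j * e2 * t / HBAR))

x0 = 0.4
for t in (0.0, 0.5, 1.0):
    print(t, abs(psi_stationary(x0, t)) ** 2, abs(psi_mixed(x0, t)) ** 2)
```

The first column of probabilities stays fixed, while the second oscillates at the frequency $(E_2 - E_1)/\hbar$.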

This situation is somewhat related to the uncertainty principle; for 
in a situation in which we know that the energy is exactly E we must be 
completely uncertain of the time. This is consonant with the idea that 
the properties of an atom in a specific state are absolutely independent 
of the time, so that at any time we would obtain the same result. 

Suppose that $E_1$ is a possible energy for which Eq. (4.42) has a
solution $\phi_1(x)$ and that $E_2$ is another value for energy for which this
equation has some other solution $\phi_2(x)$. Then we know two special
solutions of the Schrödinger equation, namely,

$$\psi_1(x,t) = \phi_1(x)e^{-(i/\hbar)E_1t} \qquad\text{and}\qquad \psi_2(x,t) = \phi_2(x)e^{-(i/\hbar)E_2t} \tag{4.43}$$

Since the Schrödinger equation is linear, it is clear that if $\psi$ is a solution,
then so is $c\psi$. Furthermore, if $\psi_1$ is a solution and $\psi_2$ is a solution, then
the sum $\psi_1 + \psi_2$ is also a solution. Evidently, then, the function

$$\psi(x,t) = c_1\phi_1(x)e^{-(i/\hbar)E_1t} + c_2\phi_2(x)e^{-(i/\hbar)E_2t} \tag{4.44}$$

is also a solution of the Schrödinger equation.

As a matter of fact, it can be shown that if all of the possible values
of $E$ and the corresponding functions $\phi(x)$ are worked out, any solution
$\psi(x,t)$ of Eq. (4.14) can be written as a linear combination of these
special solutions of definite energy.

The total probability to be anywhere is constant, as we showed in
Sec. 4-1. This must be true no matter what the values of $c_1$ and $c_2$, so
that, using Eq. (4.44) for $\psi(x,t)$, we have

$$\int_{-\infty}^{\infty}\psi^*\psi\,dx = c_1^*c_1\int\phi_1^*\phi_1\,dx + c_1^*c_2\,e^{+(i/\hbar)(E_1-E_2)t}\int\phi_1^*\phi_2\,dx + c_1c_2^*\,e^{-(i/\hbar)(E_1-E_2)t}\int\phi_1\phi_2^*\,dx + c_2^*c_2\int\phi_2^*\phi_2\,dx \tag{4.45}$$

Since this must give a constant result, the time-varying terms (i.e., terms
including $e^{\pm(i/\hbar)(E_1-E_2)t}$) must vanish for all possible choices of $c_1$ and
$c_2$. This means that

$$\int_{-\infty}^{\infty}\phi_1^*\phi_2\,dx = \int_{-\infty}^{\infty}\phi_2^*\phi_1\,dx = 0 \tag{4.46}$$

When two functions $f(x)$ and $g(x)$ satisfy $\int f^*(x)g(x)\,dx = 0$, we say
they are orthogonal. Thus Eq. (4.46) says that two states of
different energies are orthogonal.

In Sec. 5-2 we shall learn an interpretation for expressions such as
$\int f^*(x)g(x)\,dx$, and we shall find that Eq. (4.46) records the fact that
if a particle is known to have energy $E_1$ (and hence a wave function
$\psi_1 = e^{-(i/\hbar)E_1t}\phi_1$), then the amplitude that it is found to have a different
energy $E_2$ (i.e., wave function $\psi_2 = e^{-(i/\hbar)E_2t}\phi_2$) must be zero.


Problem 4-8 Show from the fact that $H$ is hermitian that $E$ is
real. (Hint: Choose $f = g = \phi$ in Eq. (4.30).)


Problem 4-9 Show from the fact that $H$ is hermitian that Eq. (4.46)
holds. (Hint: Choose $f = \phi_2$, $g = \phi_1$ in Eq. (4.30).)


Linear Combinations of Steady-state Functions. Suppose that
our functions corresponding to the set of energy levels $E_n$ are not only
orthogonal but also normalized, i.e., that the integral of the absolute
square over all $x$ is 1. Then we shall have

$$\int_{-\infty}^{\infty}\phi_n^*(x)\phi_m(x)\,dx = \delta_{n,m} \tag{4.47}$$

where $\delta_{n,m}$, the Kronecker delta, is defined by $\delta_{n,m} = 0$ if $n \neq m$ and
$\delta_{n,n} = 1$. Many functions can be expressed as a linear combination of
such $\phi_n(x)$'s. In particular, any function which is likely to arise as a
wave function can be so expressed. That is,

$$f(x) = \sum_{n=1}^{\infty}a_n\phi_n(x) \tag{4.48}$$

The coefficients $a_n$ are easily obtained: multiply Eq. (4.48) by $\phi_m^*(x)$
and integrate over all $x$ to obtain

$$\int_{-\infty}^{\infty}\phi_m^*(x)f(x)\,dx = \sum_{n=1}^{\infty}a_n\int_{-\infty}^{\infty}\phi_m^*(x)\phi_n(x)\,dx = a_m \tag{4.49}$$

That is,

$$a_n = \int_{-\infty}^{\infty}\phi_n^*(x)f(x)\,dx \tag{4.50}$$

Thus we have the identity

$$f(x) = \sum_{n=1}^{\infty}\phi_n(x)\int_{-\infty}^{\infty}\phi_n^*(y)f(y)\,dy = \int_{-\infty}^{\infty}\left[\sum_{n=1}^{\infty}\phi_n(x)\phi_n^*(y)\right]f(y)\,dy \tag{4.51}$$

This shows that, in terms of the Dirac delta function,

$$\delta(x - y) = \sum_{n=1}^{\infty}\phi_n(x)\phi_n^*(y) \tag{4.52}$$
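Equations (4.48) and (4.50) can be tried out on a concrete orthonormal set. The sketch below assumes, purely as an example, the functions $\sqrt{2/\pi}\sin(nx)$ on $0 \le x \le \pi$ and the test function $f(x) = x(\pi - x)$; the trapezoidal integration and the cutoff at 25 terms are likewise choices made here.

```python
import math

def phi(n, x):
    """Orthonormal set on 0 <= x <= pi (an assumed example): sqrt(2/pi) sin(n x)."""
    return math.sqrt(2.0 / math.pi) * math.sin(n * x)

def f(x):
    """Function to expand; it vanishes at both ends like the phi_n."""
    return x * (math.pi - x)

def integrate(func, a, b, steps=2000):
    """Trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h

def coefficient(n):
    """a_n from Eq. (4.50); phi_n is real here, so phi_n* = phi_n."""
    return integrate(lambda x: phi(n, x) * f(x), 0.0, math.pi)

N_TERMS = 25
a = {n: coefficient(n) for n in range(1, N_TERMS + 1)}

def f_rebuilt(x):
    """Partial sum of Eq. (4.48)."""
    return sum(a[n] * phi(n, x) for n in a)

print(f(1.0), f_rebuilt(1.0))
```

With only 25 terms the partial sum already reproduces $f$ closely, and the even-$n$ coefficients vanish by symmetry.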


It is possible to express the kernel $K(b,a)$ in terms of these functions
$\phi_n(x)$ and energy values $E_n$. We do so by the following consideration.
Let us ask this: If $f(x)$ is the known wave function at the time $t_a$, what
is the wave function at time $t_b$? It can be written at any time $t$ as

$$\psi(x,t) = \sum_{n=1}^{\infty}c_ne^{-(i/\hbar)E_nt}\phi_n(x) \tag{4.53}$$

for it is a solution of the Schrödinger equation, and any solution can be
written in this form. But at the time $t_a$ we have

$$f(x) = \psi(x,t_a) = \sum_{n=1}^{\infty}c_ne^{-(i/\hbar)E_nt_a}\phi_n(x) = \sum_{n=1}^{\infty}a_n\phi_n(x) \tag{4.54}$$

since we can always express $f(x)$ in the form of Eq. (4.48). So we
conclude

$$c_n = a_ne^{+(i/\hbar)E_nt_a} \tag{4.55}$$

Putting this into Eq. (4.53), we have

$$\psi(x,t_b) = \sum_{n=1}^{\infty}c_ne^{-(i/\hbar)E_nt_b}\phi_n(x) = \sum_{n=1}^{\infty}a_ne^{-(i/\hbar)E_n(t_b-t_a)}\phi_n(x) \tag{4.56}$$

Now using Eq. (4.50) for the coefficient $a_n$, we obtain

$$\psi(x,t_b) = \sum_{n=1}^{\infty}\phi_n(x)e^{-(i/\hbar)E_n(t_b-t_a)}\int_{-\infty}^{\infty}\phi_n^*(y)f(y)\,dy = \int_{-\infty}^{\infty}\sum_{n=1}^{\infty}\phi_n(x)\phi_n^*(y)e^{-(i/\hbar)E_n(t_b-t_a)}f(y)\,dy \tag{4.57}$$

This final expression determines the wave function at time $t_b$
completely in terms of $f(x)$, the wave function at time $t_a$. Previously we
represented this relation by the equation

$$\psi(x,t_b) = \int_{-\infty}^{\infty}K(x,t_b;y,t_a)f(y)\,dy \tag{4.58}$$

Comparing Eqs. (4.57) and (4.58), we finally obtain the desired
expression for the kernel $K(b,a)$,

$$K(x_b,t_b;x_a,t_a) = \begin{cases}\displaystyle\sum_{n=1}^{\infty}\phi_n(x_b)\phi_n^*(x_a)e^{-(i/\hbar)E_n(t_b-t_a)} & \text{for } t_b > t_a \\[1ex] 0 & \text{for } t_b < t_a\end{cases} \tag{4.59}$$

This expression for $K(b,a)$ is very useful for translating expressions
to more conventional representations. It expresses the kernel, which was
originally a path integral, entirely in terms of solutions of the differential
equation (4.42).
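The way Eq. (4.59) acts on a wave function through Eq. (4.58) can be sketched with a truncated sum. The example system here is an assumption: a particle between walls at $x = 0$ and $x = \pi$ with $\hbar = m = 1$, so that $\phi_n = \sqrt{2/\pi}\sin(nx)$ and $E_n = n^2/2$. The cutoff `N_MODES` and the trapezoidal integration are also choices of this sketch. A truncated kernel is exact only for initial functions built from the retained states.

```python
import cmath
import math

HBAR = 1.0
N_MODES = 6

def phi(n, x):
    """Assumed example states: particle between walls at x = 0 and x = pi."""
    return math.sqrt(2.0 / math.pi) * math.sin(n * x)

def energy(n):
    """E_n = n^2 / 2 for that box in units hbar = m = 1."""
    return 0.5 * n * n

def kernel(xb, xa, dt):
    """Eq. (4.59) truncated to N_MODES terms; zero for dt < 0."""
    if dt < 0.0:
        return 0j
    return sum(phi(n, xb) * phi(n, xa) * cmath.exp(-1j * energy(n) * dt / HBAR)
               for n in range(1, N_MODES + 1))

def propagate(f, xb, dt, steps=400):
    """psi(xb, t_b) = integral of K(xb, t_b; y, t_a) f(y) dy, trapezoidal rule."""
    h = math.pi / steps
    total = 0j
    for i in range(1, steps):      # integrand vanishes at y = 0 and y = pi
        y = i * h
        total += kernel(xb, y, dt) * f(y)
    return total * h

def f0(y):
    return phi(2, y)               # start in the n = 2 state

dt, xb = 0.7, 1.1
numeric = propagate(f0, xb, dt)
expected = cmath.exp(-1j * energy(2) * dt / HBAR) * phi(2, xb)
print(numeric, expected)
```

Starting in the $n = 2$ state, the propagated function is the same state multiplied by the phase $e^{-(i/\hbar)E_2(t_b - t_a)}$, as Eq. (4.56) requires.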


Problem 4-10 Verify that $K(b,a)$ as expressed in Eq. (4.59)
satisfies the Schrödinger equation (4.29).


Problem 4-11 Show that for free particles in three dimensions the
solutions

$$\phi_{\mathbf{p}}(\mathbf{x}) = e^{(i/\hbar)\mathbf{p}\cdot\mathbf{x}} \tag{4.60}$$

go with the energy $E_p = p^2/2m$. Consider the vector $\mathbf{p}$ as an index $n$
and note the orthogonality. That is, as long as $\mathbf{p} \neq \mathbf{p}'$,

$$\int\phi_{\mathbf{p}}^*(\mathbf{x})\phi_{\mathbf{p}'}(\mathbf{x})\,d^3\mathbf{x} = 0 \qquad \text{even if } E_p = E_{p'} \tag{4.61}$$

Therefore the free-particle kernel must be

$$K_0(\mathbf{x}_b,t_b;\mathbf{x}_a,t_a) = \sum_{\mathbf{p}}e^{(i/\hbar)\mathbf{p}\cdot(\mathbf{x}_b-\mathbf{x}_a)}e^{-(i/\hbar)(p^2/2m)(t_b-t_a)} \tag{4.62}$$

Since the $\mathbf{p}$'s are distributed over a continuum, the sum over the
"indices" $\mathbf{p}$ is really equivalent to an integral over the values of $\mathbf{p}$, namely

$$\sum_{\mathbf{p}}(\ ) \to \int(\ )\,\frac{d^3\mathbf{p}}{(2\pi\hbar)^3} \tag{4.63}$$

Therefore, we find that the free-particle kernel is given by

$$K_0(\mathbf{x}_b,t_b;\mathbf{x}_a,t_a) = \int e^{(i/\hbar)\mathbf{p}\cdot(\mathbf{x}_b-\mathbf{x}_a)}e^{-(i/\hbar)(p^2/2m)(t_b-t_a)}\,\frac{d^3\mathbf{p}}{(2\pi\hbar)^3} \tag{4.64}$$

Problem 4-12 Carry out the integral in Eq. (4.64) by completing
the square. Show that the correct free-particle kernel (i.e., the
three-dimensional version of Eq. 3.3) results.


4-3 NORMALIZING THE FREE-PARTICLE WAVE FUNCTIONS


The derivation of the kernel for a free particle, as given in Prob. 4-11,
is unsatisfactory for two related reasons. First, the idea of a sum over
distinct states $n$ used in Eq. (4.59) is not satisfactory if the states lie
in a continuum, as they do for a free particle where any $\mathbf{p}$ is allowed.
Second, the plane-wave functions for free particles, although orthogonal,
cannot be normalized; that is, $\int_{-\infty}^{\infty}\phi^*\phi\,dx = \int_{-\infty}^{\infty}1\,dx = \infty$, so the
condition of Eq. (4.47), used in deriving Eq. (4.59), is not satisfied.
Both of these points can be remedied together in a perfectly
straightforward mathematical way. Starting all the way back when we expressed
an arbitrary function as a sum of eigenfunctions,

$$f(x) = \sum_na_n\phi_n(x) \tag{4.65}$$

we allow part, or all, of the states to lie in a continuum, so that the sum
over $n$ must be replaced partly by an integral. With mathematical care
one can find the correct expression for $K$ analogous to Eq. (4.59) but
applying also when the states are in a continuum.


Normalizing in a Box. Many physicists prefer another, less
rigorous approach. They modify the original problem in a way that (from
physical reasoning) will not essentially modify the result yet will leave
all the states separate in energy and all the simple sums as simple sums.
In our example this may be accomplished as follows. We are studying
the amplitude that in a finite time a particle goes from $\mathbf{x}_a$ at time $t_a$ to
$\mathbf{x}_b$ at time $t_b$. Now if these two points are some finite distance apart and
the time is not extremely long, surely it can make no appreciable
difference to the amplitude whether the electron is really free or is instead
confined to some enormous box of volume "Vol" with walls very, very
far from $\mathbf{x}_a$ and $\mathbf{x}_b$. The amplitude could be affected only if the particle
could run out to the walls and back in the time $t_b - t_a$; but if the walls
are far enough away, there is no appreciable amplitude for this.

It is always possible that this assumption fails for some
special-shaped walls such that, for example, $\mathbf{x}_b$ is at a focus of waves from $\mathbf{x}_a$
reflected at the walls. From time to time someone lets an error creep in
by replacing a system in empty space with one at the center of a large
spherical box. The fact that this system remains at the exact center of a
perfect sphere may have an effect (like the spot of light at the center of
the shadow of a perfectly circular object) which does not vanish as the
sphere radius goes to infinity. For another shape, or a system off-center
to the sphere, the surface effect would vanish.

Take first the case of one dimension. In empty space the
space-dependent wave functions are $e^{i(p/\hbar)x}$ (any $p$, positive or negative). If,
instead, the range of $x$ is limited to $-L/2$ to $+L/2$, say, what are the
functions $\phi(x)$? The answer depends on the boundary conditions
defining $\phi(x)$ at $x = -L/2$ and $x = +L/2$. The easiest conditions to
understand physically are those for walls which offer very high repulsive
potentials to the particle, thus confining it (i.e., perfect reflectors). They
correspond to $\phi(x) = 0$ at $x = -L/2$ and $x = +L/2$. The solutions of
the wave equation

$$-\frac{\hbar^2}{2m}\frac{\partial^2\phi}{\partial x^2} = E\phi \tag{4.66}$$

in the range $|x| < L/2$ are, for $E = p^2/2m = \hbar^2k^2/2m$,

$$e^{ikx} \qquad\text{and}\qquad e^{-ikx}$$

or any linear combination. Neither $e^{ikx}$ nor $e^{-ikx}$ can satisfy the
boundary conditions, but with $k = n\pi/L$ ($n$ an integer) satisfactory solutions
are given by half the sum (which is $\cos(kx)$) for $n$ odd and $i/2$ times
the difference (which is $\sin(kx)$) for $n$ even, as diagrammed in Fig. 4-1.
Thus the states are sines and cosines and the energy levels are separated
(i.e., not in a continuum).

If the solutions are written as

$$\sqrt{\frac{2}{L}}\cos(kx) \qquad\text{and}\qquad \sqrt{\frac{2}{L}}\sin(kx)$$

then they are normalized, since

$$\int_{-L/2}^{L/2}\left(\sqrt{\frac{2}{L}}\cos(kx)\right)^2dx = 1 \tag{4.67}$$
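The normalization of Eq. (4.67), together with the orthogonality of different box states, can be confirmed by direct numerical integration. This is a minimal sketch; the box length `L = 3.0`, the helper names, and the trapezoidal rule are choices made here.

```python
import math

L = 3.0  # box length; any positive value works

def box_state(n, x):
    """cos(kx) for odd n, sin(kx) for even n, with k = n*pi/L, scaled by sqrt(2/L)."""
    k = n * math.pi / L
    trig = math.cos if n % 2 == 1 else math.sin
    return math.sqrt(2.0 / L) * trig(k * x)

def overlap(n, m, steps=4000):
    """Trapezoidal estimate of the integral of phi_n * phi_m over the box."""
    h = L / steps
    total = 0.0
    for i in range(steps + 1):
        x = -L / 2.0 + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * box_state(n, x) * box_state(m, x)
    return total * h

# Table of overlaps for the first four states: close to the unit matrix.
for n in range(1, 5):
    print([round(overlap(n, m), 6) for m in range(1, 5)])
```

The diagonal entries come out as 1 and the off-diagonal entries as 0 to rounding error, and each state vanishes at $x = \pm L/2$ as the reflecting walls require.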


A sum over states is a sum over $n$. If we consider, say, the sine wave
functions (thus, even values of $n$) for very large $L$ but not large $x$ (walls
far from the point of interest), the successive functions differ by only a
small amount. This difference

$$\sqrt{\frac{2}{L}}\left[\sin\left(2\pi(n+1)\frac{x}{L}\right) - \sin\left(2\pi n\frac{x}{L}\right)\right] = \sqrt{\frac{2}{L}}\left[2\cos\left(2\pi\,\frac{2n+1}{2}\,\frac{x}{L}\right)\sin\left(2\pi\,\frac{1}{2}\,\frac{x}{L}\right)\right] \approx \frac{2\pi x}{L}\sqrt{\frac{2}{L}}\cos\left(2\pi\left(n+\tfrac{1}{2}\right)\frac{x}{L}\right) \tag{4.68}$$

is approximately proportional to the small quantity $x/L$. So a sum on
$n$ can be replaced by an integral over $k = 2\pi n/L$. Since the successive
allowed values of $k$ (for sine functions) are spaced by $2\pi/L$, there are
$(L/2\pi)\Delta k$ states in range $\Delta k$. All of this applies also to states with the
cosine wave function, so that we may replace sums by integrals in our
formulas with

$$\sum_{n=1}^{\infty}(\ ) \to \int_0^{\infty}(\ )\,\frac{L}{2\pi}\,dk \tag{4.69}$$

and remember to add the result for the two kinds of wave functions,
namely, $\sqrt{2/L}\cos(kx)$ and $\sqrt{2/L}\sin(kx)$.

[Figure 4-1: horizontal axis marked at $x = -L/2$, $x = 0$, and $x = +L/2$.]

Fig. 4-1 The form of the one-dimensional wave functions which
have been normalized in a box. The first four are shown. The
corresponding energy levels are $E_1 = \hbar^2\pi^2/2mL^2$, $E_2 = 4E_1$,
$E_3 = 9E_1$, and $E_4 = 16E_1$. The magnitude of the energy in absolute
terms, which depends on the size of our fictitious box, is not
important for more realistic problems. Rather, it is the relation
between the energy levels of the various states which has significance.

It is often inconvenient to use $\sin(kx)$ and $\cos(kx)$ for the wave
functions as we would like to use the linear combinations

$$e^{ikx} = \cos(kx) + i\sin(kx) \qquad\text{and}\qquad e^{-ikx} = \cos(kx) - i\sin(kx)$$

We were forced by our box to use sines and cosines and not the linear
combination, because for a given $k$ one, but not both, of the functions
is a solution. But if we can disregard small errors arising from these
small differences in $k$, we might still expect to be able to get the correct
results from these new linear combinations. Normalized, they are

$$\sqrt{\frac{1}{L}}\,e^{ikx} \qquad\text{and}\qquad \sqrt{\frac{1}{L}}\,e^{-ikx}$$

Since the wave $e^{-ikx}$ can be thought of as $e^{ikx}$ but for negative values
of $k$, our new procedure, including the addition of the two kinds of wave
functions, becomes the following practical rule:

To deal with free-particle wave functions $e^{ikx}$, normalize them to a
range of $x$ of length $L$ (i.e., use $\phi(x) = \sqrt{1/L}\,e^{ikx}$), and replace sums on
states by integrals over $k$ with the rule that the number of states with $k$
in the range $k$ to $k + dk$ is $(L/2\pi)\,dk$ and the range of $k$ is $-\infty$ to $+\infty$.
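The counting rule can be checked by listing the allowed wave numbers directly. The sketch below assumes the spacing $k = 2\pi n/L$ with $n$ any integer; the function names and the sample range of $k$ are hypothetical choices for illustration.

```python
import math

def count_states(box_length, k_low, k_high):
    """Count integers n with k_low <= 2*pi*n/box_length < k_high."""
    n_low = math.ceil(k_low * box_length / (2.0 * math.pi))
    n_high = math.ceil(k_high * box_length / (2.0 * math.pi))
    return n_high - n_low

def rule_estimate(box_length, k_low, k_high):
    """The rule in the text: (L / 2 pi) times the width of the k range."""
    return box_length / (2.0 * math.pi) * (k_high - k_low)

for box_length in (10.0, 100.0, 1000.0):
    exact = count_states(box_length, -1.5, 2.0)
    print(box_length, exact, rule_estimate(box_length, -1.5, 2.0))
```

The exact count and the estimate never differ by as much as one state, so the fractional error shrinks as $L$ grows.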


Periodic Boundary Conditions. Sometimes this excursion into
cosines and sines and back to exponentials is avoided by the following
argument. The wall is artificial anyway, so its particular position and the
particular boundary condition should not make any physical difference
as long as it is far away. So instead of the physically simple conditions
$\phi(x) = 0$ at $x = +L/2$ and at $x = -L/2$, let us use two others for which
the solutions are indeed $e^{ikx}$ directly. These are

$$\phi(x)\ \left(\text{at } x = +\tfrac{L}{2}\right) = \phi(x)\ \left(\text{at } x = -\tfrac{L}{2}\right) \tag{4.70}$$

$$\phi'(x)\ \left(\text{at } x = +\tfrac{L}{2}\right) = \phi'(x)\ \left(\text{at } x = -\tfrac{L}{2}\right) \tag{4.71}$$

These are called periodic boundary conditions, because the same ones
would result by the requirement that $\phi(x)$ is periodic in $x$ in all space
with period $L$. It is readily verified that the functions $\sqrt{1/L}\,e^{ikx}$
are solutions, normalized to range $L$, provided $k = 2\pi n/L$ with $n$ an
integer: positive, negative, or zero. From this our rule follows directly.

In three dimensions we can see what happens by using a
rectangular box of sides $L_x$, $L_y$, $L_z$ in the three directions. Let us use periodic
boundary conditions. That is, the magnitude and first derivatives of a
wave function at a point on one face are respectively equal to the
magnitude and first derivative at the corresponding point on the opposite
face. The normalized wave function for a free particle is

$$\sqrt{\frac{1}{L_x}}\,e^{ik_xx}\sqrt{\frac{1}{L_y}}\,e^{ik_yy}\sqrt{\frac{1}{L_z}}\,e^{ik_zz} = \sqrt{\frac{1}{\text{Vol}}}\,e^{i\mathbf{k}\cdot\mathbf{x}} \tag{4.72}$$

where Vol $= L_xL_yL_z$ is the volume of the box and the allowed values
of $k_x$ are $2\pi n_x/L_x$ for $n_x$ an integer, those of $k_y$ are $2\pi n_y/L_y$ for $n_y$ an
integer, and those of $k_z$ are $2\pi n_z/L_z$ for $n_z$ an integer. Furthermore,
the number of solutions with $k_x$ in range $dk_x$, $k_y$ in $dk_y$, and $k_z$ in $dk_z$
is

$$\frac{L_x}{2\pi}\,dk_x\,\frac{L_y}{2\pi}\,dk_y\,\frac{L_z}{2\pi}\,dk_z = \text{Vol}\,\frac{d^3\mathbf{k}}{(2\pi)^3}$$

That is, use plane waves normalized to volume "Vol": $\exp\{i\mathbf{k}\cdot\mathbf{x}\}/\sqrt{\text{Vol}}$.
The number of states in range $d^3\mathbf{k}$ (differential volume of $\mathbf{k}$ space) is
Vol $d^3\mathbf{k}/(2\pi)^3$.
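The three-dimensional density of states can be tested in the same spirit by counting the allowed $\mathbf{k}$ vectors inside a sphere. This is a rough sketch under assumptions made here: the box sides, the cutoff `k_max`, and the function names are illustrative, and agreement is only expected to within a surface correction that becomes relatively small for a large box.

```python
import math

def count_modes(lx, ly, lz, k_max):
    """Count (n_x, n_y, n_z) whose k = 2*pi*(n_x/lx, n_y/ly, n_z/lz) has |k| < k_max."""
    lengths = (lx, ly, lz)
    limits = [int(k_max * length / (2.0 * math.pi)) + 1 for length in lengths]
    count = 0
    for nx in range(-limits[0], limits[0] + 1):
        kx = 2.0 * math.pi * nx / lx
        for ny in range(-limits[1], limits[1] + 1):
            ky = 2.0 * math.pi * ny / ly
            for nz in range(-limits[2], limits[2] + 1):
                kz = 2.0 * math.pi * nz / lz
                if kx * kx + ky * ky + kz * kz < k_max * k_max:
                    count += 1
    return count

def rule_estimate(lx, ly, lz, k_max):
    """Vol * (volume of the sphere |k| < k_max) / (2 pi)^3."""
    volume = lx * ly * lz
    return volume * (4.0 / 3.0) * math.pi * k_max ** 3 / (2.0 * math.pi) ** 3

sides = (20.0, 25.0, 30.0)
print(count_modes(*sides, 3.0), rule_estimate(*sides, 3.0))
```

Only the product Vol $= L_xL_yL_z$ enters the estimate, not the individual sides.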

Let us apply this to Prob. 4-11 and recall the connection between
momentum and wave number $\mathbf{p} = \hbar\mathbf{k}$ brought out in Sec. 3-1. In Eq. (4.64)
we must make two changes. First, since the wave functions used were
$\exp\{i\mathbf{p}\cdot\mathbf{x}/\hbar\}$, whereas we should have used

$$\frac{1}{\sqrt{\text{Vol}}}\exp\left\{\frac{i\mathbf{p}\cdot\mathbf{x}}{\hbar}\right\}$$

there should be an additional factor 1/Vol; for the product of two wave
functions was involved. Second, the symbol

$$\sum_{\mathbf{p}}(\ ) \quad\text{must be replaced by}\quad \text{Vol}\int(\ )\,\frac{d^3\mathbf{p}}{(2\pi\hbar)^3} \tag{4.73}$$

This justifies what was done in Prob. 4-11.

It is noted that the "Vol" factors cancel out, as indeed they must;
for as Vol $\to \infty$ the kernel $K(b,a)$ must be independent of the size of
the box.
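The cancellation can be written out explicitly. The lines below are a sketch that combines the box-normalized plane waves with the replacement rule for the sum over states, using the free-particle energies $p^2/2m$.

```latex
% Box-normalized plane waves: e^{(i/\hbar) p.x} / \sqrt{Vol}
\begin{align*}
K_0(\mathbf{x}_b,t_b;\mathbf{x}_a,t_a)
&= \sum_{\mathbf{p}} \frac{1}{\mathrm{Vol}}\,
   e^{(i/\hbar)\mathbf{p}\cdot(\mathbf{x}_b-\mathbf{x}_a)}\,
   e^{-(i/\hbar)(p^2/2m)(t_b-t_a)} \\
&\to \mathrm{Vol}\int \frac{1}{\mathrm{Vol}}\,
   e^{(i/\hbar)\mathbf{p}\cdot(\mathbf{x}_b-\mathbf{x}_a)}\,
   e^{-(i/\hbar)(p^2/2m)(t_b-t_a)}\,
   \frac{d^3\mathbf{p}}{(2\pi\hbar)^3} \\
&= \int e^{(i/\hbar)\mathbf{p}\cdot(\mathbf{x}_b-\mathbf{x}_a)}\,
   e^{-(i/\hbar)(p^2/2m)(t_b-t_a)}\,
   \frac{d^3\mathbf{p}}{(2\pi\hbar)^3}
\end{align*}
```

The last line is Eq. (4.64), with no trace of the box left in it.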


Some Remarks on Mathematical Rigor. The reader may have 
one of two reactions on seeing how the volumes “Vol” cancel at the 
end of this calculation. One might be: How nicely it cancels out as 
it should, for the walls have no effect. The other might be: Why do 
it in the complicated and “dirty” nonrigorous manner, putting in walls 
which make no difference, etc., when all this can be done much more 
elegantly and rigorously mathematically without the need of walls, etc? 
It depends on whether you are physically minded or mathematically 
minded. There are many misunderstandings between mathematicians 
and physicists on the place of mathematical rigor in physics, so perhaps 
a word as to the value of each method (the box or mathematical rigor) 
may be in order. 

There is, of course, the more trivial point: Which is most familiar 
— which takes the least new knowledge? Most physicists have seen this 
argument about how to count the states in a box before.

Another point is that the mathematically rigorous solution may not
be physically rigorous. That is, the box may in fact exist. It may not be 
a rectangular box, but it is not often that experiments are done under 
the stars. Rather they are done in a room. Although it is physically 
reasonable that the walls have no effect, it is true that the original prob- 
lem is set up as an idealization. It is no more satisfactory idealization 
to move the walls to infinity than to replace them by perfect mirrors far 
away. The mathematical rigor is wasted in the first idealization, since 
the walls are not at infinity. 

The box approach is just as rigorous, or rather just as nonrigorous. It 
has several advantages. For example, in finding that the volume cancels 
out we do learn that at least one aspect of the idealized walls, namely 
how far away they are, is unimportant. This discovery makes us more 
intuitively convinced that the actual disposition of the real environment 
may be unimportant. Finally, the formula derived is very useful when 
in fact we do have a finite sample. For example, in Chap. 8 we shall use 
it to count sound-wave modes in a large, rectangular block of material. 

On the other hand, the advantage of the mathematically clean ar- 
gument is the avoidance of much unnecessary detail that cancels out. 
Although, using the box approach, one may learn something about how 
the walls have no effect, one may be firmly convinced that this is true 
anyway and not wish to descend into details to see it again. 

The normalization problem is a special example, but it illustrates 
the point. The physicist cannot understand the mathematician’s care 
in solving an idealized physical problem. The physicist knows the real 
problem is much more complicated. It has already been simplified by 
intuition, which discards the unimportant and often approximates the 
remainder. 


5


Measurements and Operators


So far we have described quantum-mechanical systems as if we intended
to measure only the coordinates of position and time. Indeed, all
measurements of quantum-mechanical systems could be made to reduce
eventually to position and time measurements (e.g., the position of a
needle on a meter or the time of flight of a particle). Because of this
possibility a theory formulated in terms of position measurements is
complete enough in principle to describe all phenomena. Nevertheless,
it is convenient to try to answer directly a question involving, say, a
measurement of momentum without insisting that the ultimate recording of
the equipment must be a position measurement and without having to
analyze in detail that part of the apparatus which converts momentum
to a recorded position. Thus, in this chapter, instead of concentrating
on the amplitude that a particle has a definite position, we shall develop
the idea of an amplitude to find a definite momentum, energy, or other
physical quantity.

In the first section of this chapter we shall show how a system may be
described in terms of momentum and energy. The concepts learned here
will be extended in the second section to describe in general various ways
of representing the quantum-mechanical system. The transformation
functions which enable us to go from one method of representation to
another have many interesting properties. Among them is the concept
of an operator, which was introduced in the preceding chapter and will
be discussed further in the third section of this chapter.


5-1 THE MOMENTUM REPRESENTATION


The Momentum Amplitude. So far we have used the concept
of probability in terms of the position of a particle, but suppose we
wish to measure the momentum. Is there an amplitude $\phi(p)$ whose
absolute square will give us the probability $P(p)$ that a measurement of
momentum will show that the particle has momentum $p$? There is in
fact such an amplitude, and we can easily find it.

Some ways of measuring momentum (or other physical quantities)
correspond to measurements of position, and thus they can be analyzed
if we know how to analyze coordinate measurements. For example,
working in one dimension, suppose we have a particle whose position at
$t = 0$ is localized within $\pm b$ of the origin of the $x$ axis. The uncertainty
$b$ can be as large as desired so long as it is finite. We can measure the
momentum of such a particle by a time-of-flight technique. That is, we
can observe how far the particle has traveled (assuming no forces) by
the time $t = T$. If the position is $y$, then the velocity is $y/T$ and the
momentum is $p = my/T$. The error in such a momentum measurement,
$\pm mb/T$, can be made as small as desired by making $T$ sufficiently large.
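The arithmetic of the time-of-flight estimate is easy to tabulate. The sketch below is purely classical and illustrative: the mass, the localization width `b`, and the "true" momentum are made-up numbers in arbitrary units, and the particle is taken to start at one of the two extreme points $\pm b$.

```python
def momentum_from_flight(mass, y, flight_time):
    """p = m * y / T for a free particle found at y after time T."""
    return mass * y / flight_time

def momentum_error(mass, b, flight_time):
    """Uncertainty m*b/T caused by not knowing the start point within +/- b."""
    return mass * b / flight_time

mass, b, true_p = 1.0, 0.5, 2.0   # illustrative values in arbitrary units
for flight_time in (1.0, 10.0, 100.0):
    for start in (-b, b):          # extreme starting points
        y = start + true_p / mass * flight_time
        p_inferred = momentum_from_flight(mass, y, flight_time)
        print(flight_time, start, p_inferred - true_p,
              momentum_error(mass, b, flight_time))
```

Each tenfold increase in $T$ reduces the error bound $mb/T$ tenfold, and the inferred momentum never misses the true value by more than that bound.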

Suppose we analyze the momentum probability $P(p)$ as defined by
such an experiment. The probability $P(p)\,dp$ that the momentum lies
between $p$ and $p + dp$ is the probability $P(y)\,dy$ that, if all the potentials
affecting the particle are suddenly turned off, then after the time $T$ the
particle will be found between the points $y$ and $y + dy$. Of course, this
requires that we connect $p$ with $y$ by $p = my/T$. Assume the wave
function of the particle is given by $f(x)$ at $t = 0$, and our problem is to
find $P(p)$ directly in terms of $f(x)$.

The amplitude for the particle to arrive at $y$ at the time $t = T$ is

$$\psi(y,T) = \int_{-\infty}^{\infty}K_0(y,T;x,0)f(x)\,dx \tag{5.1}$$

Upon substitution for the free-particle kernel $K_0$ (Eq. 3.3), this
expression becomes

$$\psi(y,T) = \left(\frac{m}{2\pi i\hbar T}\right)^{1/2}\exp\left\{\frac{imy^2}{2\hbar T}\right\}\int_{-\infty}^{\infty}\exp\left\{\frac{im(-2yx + x^2)}{2\hbar T}\right\}f(x)\,dx \tag{5.2}$$

The absolute square of this amplitude gives the probability that the
particle lies between $y$ and $y + dy$. According to our definition, this is
identical (in the limit $T \to \infty$) with the probability that the momentum
of the particle lies between $p$ and $p + dp$.

$$P(y)\,dy = \frac{m\,dy}{2\pi\hbar T}\left|\int_{-\infty}^{\infty}\exp\left\{\frac{im(-2yx + x^2)}{2\hbar T}\right\}f(x)\,dx\right|^2 = P(p)\,dp \qquad\text{as } T \to \infty \tag{5.3}$$

Then substituting $p = my/T$, and supposing that we pass to the limit
of large $T$, there results

$$P(p)\,dp = \frac{dp}{2\pi\hbar}\left|\int_{-\infty}^{\infty}\exp\left\{-\frac{ipx}{\hbar} + \frac{imx^2}{2\hbar T}\right\}f(x)\,dx\right|^2 \tag{5.4}$$

Fig. 5-1 The amplitude for a particle traveling freely to arrive at position $y$ after
time interval $T$ is determined by the convolution of two functions. The first is the
amplitude $f(x)$ for the particle to start at position $x$, as indicated by the shaded curve
in the figure. The second, the amplitude to go from $x$ to $y$, is the free particle kernel
$K_0(y,T;x,0)$, indicated by the sine wave of slowly changing wavelength. (Shown here
is $\mathrm{Re}\{\sqrt{i}\,K_0(y,T;x,0)\}$ as a function of $x$ for fixed $y$ and $T$.) If the point $y$ is far from
the origin, compared to the distance $-b$ to $+b$ over which $f(x)$ is nonzero, the wave has
an approximately constant wavelength near the origin. Its form there is approximately
proportional to $\exp\{(-i/\hbar)(my/T)x\}$. The two functions are multiplied together and
then integrated over $x$ to find the amplitude for arrival at $y$. Since the particle has
traveled approximately the distance $y$ (again assuming $y \gg b$) in time $T$, this amplitude
is equivalent to the amplitude that the particle has momentum $p = my/T$.

We assumed earlier that, initially, the particle would be restricted
to a region within $\pm b$ of the origin. This means that the initial wave
function $f(x)$ drops to 0 for values of $x$ larger in absolute magnitude
than $b$. Now as $T$ becomes large the quantity $imb^2/2\hbar T$ becomes
negligibly small. Since there is no contribution to the integral of Eq. (5.4) for
values of $x$ greater in absolute magnitude than $b$, the probability $P(p)\,dp$
approaches $dp/2\pi\hbar$ times the absolute square of the amplitude¹

$$\phi(p) = \int_{-\infty}^{\infty}\exp\left\{-\frac{ipx}{\hbar}\right\}f(x)\,dx \tag{5.5}$$





An alternative explanation of this result is given in Fig. 5-1 and extended 
in Fig. 5-2. 

¹Many writers prefer to account for the factor 1/2πħ in the definition of φ(p),
where it appears as 1/√(2πħ). However, following the development of Sec. 4-3, we
prefer to write it in the form we have used and remember that the differential
element of momentum always includes the factor 1/2πħ in each dimension. For
example, the differential element of momentum in three-dimensional momentum
space is d³p/(2πħ)³.

Fig. 5-2 If the amplitude f(x) is roughly periodic with the same wavelength as
the overlying kernel, as shown in (a), then the integral of the product of the two
functions is large. That is, the probability that the momentum is p = my/T is large.
On the other hand, suppose the wavelengths differ for some new function f'(x), as
shown in (b). Then, when the product is taken, the contributions to the integral
from different values of x tend to cancel each other. Now the probability that the
momentum is my/T is small.
If a new position y' is chosen as a final point, as shown in (c), then a new region
of the kernel curve overlies the space −b to +b. For a correct choice of y', the
wavelength of the kernel in this new region is the same as the wavelength of f'(x)
and a large probability results. That is, there is a large probability that such a
particle has the new momentum value p = my'/T.

The expression for the momentum amplitude given by Eq. (5.5) applies
to a one-dimensional situation. It is easy to extend the definition
to the three-dimensional case where the amplitude for the momentum is

$$\phi(\mathbf{p}) = \int \exp\left(-\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{x}\right) f(\mathbf{x})\,d^3\mathbf{x} \tag{5.6}$$


where the wave function f(x) is now assumed to be defined for all points 
in the three-dimensional coordinate space. This is the amplitude that 
the particle has the momentum p at the time t = 0. (Note that it is not 
defined for the time t = T. The time interval T is part of the measur- 
ing equipment, and it can be varied without changing the momentum 
amplitude.) The absolute square of this amplitude, multiplied by the
differential momentum element, gives the probability of finding the momentum
in the interval (three-dimensional) d³p/(2πħ)³ of momentum space.
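
As a rough numerical illustration of the one-dimensional result, Eq. (5.5), the following Python sketch evaluates φ(p) for a Gaussian wave packet and forms the distribution |φ(p)|²/2πħ. The packet shape, the choice of units with ħ = 1, the cutoff b, the grid sizes, and all function names are assumptions made for this illustration; none of them come from the text.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def gaussian_packet(x, sigma=1.0, p0=2.0):
    """Normalized Gaussian wave function f(x) with mean momentum p0 (illustrative)."""
    norm = (2.0 * math.pi * sigma**2) ** -0.25
    return norm * math.exp(-x**2 / (4.0 * sigma**2)) * cmath.exp(1j * p0 * x / HBAR)


def momentum_amplitude(f, p, b=10.0, n=2000):
    """phi(p): integral over [-b, b] of exp(-i p x / hbar) f(x) dx, midpoint rule."""
    dx = 2.0 * b / n
    total = 0.0 + 0.0j
    for k in range(n):
        x = -b + (k + 0.5) * dx
        total += cmath.exp(-1j * p * x / HBAR) * f(x)
    return total * dx


def momentum_distribution(f, p_values):
    """Probability per unit p: |phi(p)|^2 / (2 pi hbar)."""
    return [abs(momentum_amplitude(f, p)) ** 2 / (2.0 * math.pi * HBAR) for p in p_values]
```

For this packet the distribution, summed over a grid of p values and multiplied by the grid spacing, should come out close to 1, with its maximum at the mean momentum p0. The factor 1/2πħ is attached to the momentum element, following the convention of the footnote above.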

We have analyzed a momentum measurement which is based on a 
time-of-flight technique. However, such an analysis can be applied to 
other techniques. The analysis of any technique for measuring momen- 
tum will give the same result for the momentum amplitude. For suppose 
we have two methods or techniques which purport to measure the same 
quantity, momentum. If one gives a different result than the other, we 
have to explain why one or the other apparatus is faulty. So if you will 
grant that the time-of-flight technique is an adequate way to define a 
momentum measurement, any other piece of equipment which measures 
momentum must give the same results P(p) dp for the distribution of 
momenta if the system is in the state f(x). Analysis of any equipment
which measures momentum must give the same expression φ(p) for the
amplitude for momentum p, within possibly an irrelevant constant phase
difference (i.e., a factor $e^{i\delta}$ with δ constant). For example, consider the
following problem. 





Problem 5-1 Consider any piece of experimental equipment de- 
signed to measure momentum by means of a classical approximation, 
such as a magnetic field analyzer. Analyze the equipment by the meth- 
ods outlined in the preceding paragraphs. Show that the same result for 
the momentum amplitude is obtained. 


Transformation to Momentum Representation. We have called
ψ(x,t) the amplitude for a particle to be at the point x at the time t.
We have found that the momentum amplitude is given by

$$\phi(\mathbf{p},t) = \int \exp\left(-\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{x}\right) \psi(\mathbf{x},t)\,d^3\mathbf{x} \tag{5.7}$$

We shall call this the amplitude that the particle has momentum p at
the time t.

It is often useful to analyze problems in this momentum representation
rather than in the coordinate representation, or, as it is often
stated, in momentum space rather than in coordinate space. Actually,
the transformation from one representation to the other is just a Fourier
transform. Thus if we have the momentum representation and wish to
find the coordinate representation, we use the inverse transform given
by

$$\psi(\mathbf{x},t) = \int \exp\left(\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{x}\right) \phi(\mathbf{p},t)\,\frac{d^3\mathbf{p}}{(2\pi\hbar)^3} \tag{5.8}$$


We can describe this last formula in the same physical terms we have 
used to describe the structure of other amplitudes. The amplitude that 
the particle is at the position x is given by the sum over alternatives. 
In this case each alternative corresponds to the product of two terms. 
One of these is the amplitude that the momentum of the particle is p,
given by φ(p). The other term, exp{ip·x/ħ}, is the amplitude that if the
momentum is p, then the particle is at the position x. This second factor
is not new to us, for we have discussed such an expression in Prob. 3-4.

Note that in the transform of Eq. (5.7) the exponent has a minus
sign. Such a term can be described in a manner parallel to that used
in the preceding paragraph. Thus we say that exp{−ip·x/ħ} is the
amplitude that if a particle is at position x, it has the momentum p.


The Kernel in Momentum Representation. We have shown
(Sec. 3-4) how a wave function at a particular time $t_b$ can be obtained
from the wave function at an earlier time $t_a$ with the help of the kernel
describing the motion of the particle in the intervening time. Thus

$$\psi(\mathbf{x}_b,t_b) = \int K(\mathbf{x}_b,t_b;\mathbf{x}_a,t_a)\,\psi(\mathbf{x}_a,t_a)\,d^3\mathbf{x}_a \tag{5.9}$$

It is possible to define a kernel in momentum space which would be used
in a parallel expression. Thus the momentum amplitude at the time $t_b$
can be derived from the momentum amplitude at an earlier time $t_a$ by

$$\phi(\mathbf{p}_b,t_b) = \int \mathcal{K}(\mathbf{p}_b,t_b;\mathbf{p}_a,t_a)\,\phi(\mathbf{p}_a,t_a)\,\frac{d^3\mathbf{p}_a}{(2\pi\hbar)^3} \tag{5.10}$$

Substituting in Eq. (5.9) for $\psi(\mathbf{x}_a,t_a)$ the expression of Eq. (5.8) and
taking the Fourier transform of $\psi(\mathbf{x}_b,t_b)$ to get $\phi(\mathbf{p}_b,t_b)$, as in Eq. (5.7),
we see that the kernel in momentum representation is given in terms of
the kernel in coordinate representation by the expression

$$\mathcal{K}(\mathbf{p}_b,t_b;\mathbf{p}_a,t_a) = \iint e^{-(i/\hbar)\mathbf{p}_b\cdot\mathbf{x}_b}\,K(\mathbf{x}_b,t_b;\mathbf{x}_a,t_a)\,e^{+(i/\hbar)\mathbf{p}_a\cdot\mathbf{x}_a}\,d^3\mathbf{x}_b\,d^3\mathbf{x}_a \tag{5.11}$$

Fig. 5-3 The kernel for a free particle in momentum space is unlike the
kernel in coordinate space. In momentum space, there is only one path which
can carry the particle to the momentum value $\mathbf{p}_b$ at the time $t_b$. That single path
must start at the momentum $\mathbf{p}_a = \mathbf{p}_b$. No other paths contribute to the kernel.
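
The substitution that leads from Eqs. (5.7) to (5.9) to the momentum-space kernel can be written out in three lines. This is only a sketch of the intermediate algebra; the bracket in the last line is what is identified with the kernel of Eq. (5.10).

```latex
\begin{aligned}
\phi(\mathbf{p}_b,t_b)
  &= \int e^{-(i/\hbar)\mathbf{p}_b\cdot\mathbf{x}_b}\,\psi(\mathbf{x}_b,t_b)\,d^3\mathbf{x}_b \\
  &= \iint e^{-(i/\hbar)\mathbf{p}_b\cdot\mathbf{x}_b}\,
     K(\mathbf{x}_b,t_b;\mathbf{x}_a,t_a)\,\psi(\mathbf{x}_a,t_a)\,
     d^3\mathbf{x}_a\,d^3\mathbf{x}_b \\
  &= \int \left[\iint e^{-(i/\hbar)\mathbf{p}_b\cdot\mathbf{x}_b}\,
     K(\mathbf{x}_b,t_b;\mathbf{x}_a,t_a)\,
     e^{+(i/\hbar)\mathbf{p}_a\cdot\mathbf{x}_a}\,
     d^3\mathbf{x}_a\,d^3\mathbf{x}_b\right]
     \phi(\mathbf{p}_a,t_a)\,\frac{d^3\mathbf{p}_a}{(2\pi\hbar)^3}
\end{aligned}
```

The first line is the transform (5.7) at time $t_b$, the second inserts the propagation rule (5.9), and the third replaces $\psi(\mathbf{x}_a,t_a)$ by the inverse transform (5.8) and exchanges the order of integration.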


For example, the kernel describing the motion of a free particle in
momentum space is found by using $K_0$ from Eq. (3.3) in Eq. (5.11). The
result of the integration is

$$\mathcal{K}_0(\mathbf{p}_b,t_b;\mathbf{p}_a,t_a) = \begin{cases} (2\pi\hbar)^3\,\delta^3(\mathbf{p}_b - \mathbf{p}_a)\,\exp\left\{-\dfrac{i}{\hbar}\dfrac{|\mathbf{p}_a|^2}{2m}(t_b - t_a)\right\} & \text{for } t_b > t_a \\ 0 & \text{for } t_b < t_a \end{cases} \tag{5.12}$$

(The last line follows the convention of Eq. (4.28).) The occurrence of
the Dirac delta function in this expression shows that the momentum
of a free particle does not change, as diagrammed in Fig. 5-3. However,
the phase of the momentum wave function changes continuously in accordance
with the factor $e^{-(i/\hbar)Et}$ where E = p²/2m. This result given
by Eq. (5.12) can also be seen directly from Eq. (4.64).
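
The statement that a free particle keeps its momentum distribution while only the phase of φ changes can be tried out numerically. The sketch below works in one dimension, multiplies a Gaussian momentum amplitude by the phase factor exp(−iEt/ħ), and transforms back to coordinate space with the one-dimensional analog of Eq. (5.8). The Gaussian starting state, the units ħ = m = 1, the grid limits, and the function names are all assumptions chosen for illustration.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1
MASS = 1.0  # assumption: unit mass


def phi0(p, sigma=1.0):
    """Momentum amplitude at t = 0 of a Gaussian packet of width sigma at rest."""
    return (8.0 * math.pi * sigma**2) ** 0.25 * math.exp(-(sigma * p / HBAR) ** 2)


def phi_t(p, t, sigma=1.0):
    """Free evolution: only the phase exp(-i E t / hbar), E = p^2 / 2m, changes."""
    energy = p * p / (2.0 * MASS)
    return cmath.exp(-1j * energy * t / HBAR) * phi0(p, sigma)


def psi_t(x, t, sigma=1.0, p_max=6.0, n=240):
    """psi(x,t): integral of exp(i p x / hbar) phi(p,t) dp / (2 pi hbar), midpoint rule."""
    dp = 2.0 * p_max / n
    total = 0.0 + 0.0j
    for k in range(n):
        p = -p_max + (k + 0.5) * dp
        total += cmath.exp(1j * p * x / HBAR) * phi_t(p, t, sigma)
    return total * dp / (2.0 * math.pi * HBAR)


def position_variance(t, sigma=1.0, x_max=12.0, n=240):
    """<x^2> of |psi(x,t)|^2 (the mean of x is zero for this packet)."""
    dx = 2.0 * x_max / n
    norm = 0.0
    second = 0.0
    for k in range(n):
        x = -x_max + (k + 0.5) * dx
        weight = abs(psi_t(x, t, sigma)) ** 2 * dx
        norm += weight
        second += x * x * weight
    return second / norm
```

With these choices |φ(p,t)| equals |φ(p,0)| at every p, while the coordinate-space packet spreads: the variance should follow the standard free-packet result σ²[1 + (ħt/2mσ²)²]. The default grid limits are tuned to σ = 1 and would need adjusting for other widths.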

This momentum-space kernel offers a much simpler representation 
of the free particle than does the coordinate-space kernel. Generally, 
when the particle is not free, but rather moves under the influence of 
a potential, the kernel in momentum representation loses its simplicity. 
But if the effect of the potential can be represented in a perturbation 
expansion, this simplicity is regained (Chap. 6). 


The Energy-Time Transformation. For many applications, particularly
in relativistic quantum mechanics, it is best to treat the variables
of space and time in a symmetric manner. Then in transforming
from coordinate representation to momentum representation we include
a transformation from time to energy. Thus the complete transformation
for a kernel is

$$k(\mathbf{p}_b,E_b;\mathbf{p}_a,E_a) = \iint\int_{-\infty}^{\infty}\int_{t_a}^{\infty} e^{-(i/\hbar)\mathbf{p}_b\cdot\mathbf{x}_b}\,e^{+(i/\hbar)E_b t_b}\,K(\mathbf{x}_b,t_b;\mathbf{x}_a,t_a)\,e^{+(i/\hbar)\mathbf{p}_a\cdot\mathbf{x}_a}\,e^{-(i/\hbar)E_a t_a}\,dt_b\,dt_a\,d^3\mathbf{x}_b\,d^3\mathbf{x}_a \tag{5.13}$$


The energy E is not equal to p²/2m, but is instead an extra independent
variable (the coefficient of time) needed to define the kernel. Only if the
system exists in the same energy state for an infinite time can an exact
measurement of E be made to establish the relation between energy and
momentum.

As an example, we shall work out the kernel for a free particle. For
this case the integrals over $\mathbf{x}_b$ and $\mathbf{x}_a$ have already been worked out,
with the results given in Eq. (5.12). Thus we are left with the integrals
over $t_b$ and $t_a$. Make the substitution $t_b = t_a + \tau$. Then the double
integral can be written as


$$\int_{-\infty}^{\infty} e^{(i/\hbar)(E_b - E_a)t_a}\,dt_a \int_0^{\infty} e^{(i/\hbar)(E_b - p^2/2m)\tau}\,d\tau \tag{5.14}$$

The first of these two integrals is a representation of the Dirac delta
function. In particular it is $2\pi\hbar\,\delta(E_b - E_a)$. The second integral is of
the form

$$\int_0^{\infty} e^{i\omega\tau}\,d\tau \tag{5.15}$$


This latter integral arises often in quantum-mechanical problems. If ω is
a real number, the integral does not converge. In order to carry out the
present calculation, we shall replace ω with a complex number ω + iε.
When both ω and ε are real numbers, with ε > 0, the integral has the
value i/(ω + iε).
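
The quoted value can be checked numerically for a finite ε, where the integrand decays and the integral may safely be cut off. The following Python sketch does this with a midpoint rule; the cutoff `tau_max`, the step count, and the function names are assumptions of the example.

```python
import cmath


def damped_integral(omega, eps, tau_max=80.0, n=80000):
    """Midpoint estimate of the integral of exp(i (omega + i eps) tau) over 0..tau_max."""
    dtau = tau_max / n
    total = 0.0 + 0.0j
    for k in range(n):
        tau = (k + 0.5) * dtau
        total += cmath.exp(1j * (omega + 1j * eps) * tau)
    return total * dtau


def closed_form(omega, eps):
    """The value quoted in the text: i / (omega + i eps)."""
    return 1j / (omega + 1j * eps)
```

For example, `damped_integral(2.0, 0.5)` and `closed_form(2.0, 0.5)` should agree closely. The cutoff is harmless only while eps * tau_max is large; for very small ε the truncated integral no longer settles down, which mirrors the non-convergence for real ω noted above.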

Now it would be possible to take the limit of this fraction as ε approaches
0 and interpret the result simply as i/ω. However, such an
interpretation would lead to incorrect (or rather, incomplete) results in
further work. The function we are evaluating is a kernel, and in future
work it will often be integrated (multiplied by some other function) over
values of ω or its equivalent. If ε were dropped from the expression, then
such integrals would have a pole at ω = 0, and we would be at a loss
what to do.

It would not be correct to take just the principal part of the integral
at such a pole. This would give the wrong result. In particular, such a
result would imply that the inverse transform of the kernel would not
give back the original coordinate representation kernel with which we
started. Such a transform would differ from the correct kernel in that it
would not be zero for values of time less than zero. One way to obtain
the correct result from such integrals is to displace the pole an infinitesimal
distance from the real axis (for i/(ω + iε) the pole sits at ω = −iε, just
below the axis). This is accomplished by leaving ε in the expression.

If we rationalize the expression as

$$\frac{i}{\omega + i\epsilon} = \frac{i(\omega - i\epsilon)}{\omega^2 + \epsilon^2} = \frac{i\omega}{\omega^2 + \epsilon^2} + \frac{\epsilon}{\omega^2 + \epsilon^2} \tag{5.16}$$

we can interpret the first term on the right-hand side as i/ω and in
further integrations use the principal part of an integral involving this
term. The second term becomes πδ(ω) as ε approaches 0, and it is
to be interpreted as such in further integrations. That is, if a more
precise mathematical definition is wanted, i/(ω + iε) should be replaced
by P.P.(i/ω) + πδ(ω). This means that

$$\int_0^{\infty} e^{i\omega\tau}\,d\tau = \lim_{\epsilon \to 0} \frac{i}{\omega + i\epsilon} = \text{P.P.}\left(\frac{i}{\omega}\right) + \pi\delta(\omega) \tag{5.17}$$

(This result is recorded in the Appendix as Eq. A.7.) In all expressions
containing ε, a limit as ε → 0⁺ is implied.
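
One way to see the replacement i/(ω + iε) → P.P.(i/ω) + πδ(ω) at work is to integrate the left-hand side against a smooth test function. The Python sketch below uses the even test function exp(−ω²), for which the principal-part term contributes nothing by symmetry, so the result should approach π times the test function at ω = 0, that is, π. The test function, the grid, and the function name are assumptions of this example.

```python
import math


def smeared_kernel(eps, half_width=8.0, n=16000):
    """Integral of exp(-w^2) * i / (w + i eps) over w, by the midpoint rule."""
    dw = 2.0 * half_width / n
    total = 0.0 + 0.0j
    for k in range(n):
        w = -half_width + (k + 0.5) * dw
        total += math.exp(-w * w) * 1j / (w + 1j * eps)
    return total * dw
```

For finite ε the real part equals π exp(ε²) erfc(ε), a standard integral, which tends to π as ε shrinks, while the imaginary part vanishes. The grid spacing must stay well below ε, so the defaults are not suitable for ε much smaller than about 0.01.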

Returning to the evaluation of the kernel, we replace ω with
$(E_a - p_a^2/2m)/\hbar$ to find

$$k_0(\mathbf{p}_b,E_b;\mathbf{p}_a,E_a) = (2\pi\hbar)^4\,\delta^3(\mathbf{p}_a - \mathbf{p}_b)\,\delta(E_a - E_b)\,\frac{i\hbar}{E_a - p_a^2/2m + i\epsilon} \tag{5.18}$$

The existence of the delta functions in this expression means that neither
the energy E nor the momentum p changes during the motion of a free
particle. These two quantities affect the motion of the particle as shown
by the remaining pieces of this equation. That is, the amplitude for the
motion from one point to another of a free particle with energy E and
momentum p is proportional to i/(E − p²/2m + iε).
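
The factors in the free-particle result can be traced to the separate integrations. The bookkeeping below is a sketch; the symbol $\epsilon'$ for the small quantity before rescaling is introduced here only for clarity.

```latex
\begin{aligned}
k_0 &= \underbrace{(2\pi\hbar)^3\,\delta^3(\mathbf{p}_a-\mathbf{p}_b)}_{\text{space integrals, Eq. (5.12)}}
       \;\underbrace{2\pi\hbar\,\delta(E_a-E_b)}_{t_a\text{ integral}}
       \;\underbrace{\frac{i}{\omega+i\epsilon'}}_{\tau\text{ integral}},
       \qquad \omega=\frac{E_a-p_a^2/2m}{\hbar}, \\[4pt]
\frac{i}{\omega+i\epsilon'} &= \frac{i\hbar}{E_a-p_a^2/2m+i\hbar\epsilon'}
       = \frac{i\hbar}{E_a-p_a^2/2m+i\epsilon}, \qquad \epsilon=\hbar\epsilon' .
\end{aligned}
```

Because only the limit of vanishing ε matters, the rescaling by ħ has no effect on later results; the delta function in energy allows $E_a$ and $E_b$ to be used interchangeably in the last factor.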

Earlier in this section it was mentioned that the energy E is not
in general identical to p²/2m, but is instead a separate variable. To
understand the distinction, let us look at the kernel for a free particle,
which is a wave-like function in time and space and wherein E is the
coefficient of time and thus has the properties of a frequency. This
kernel, given in Eq. (5.12), has the form shown in Fig. 5-4 when plotted
against the time difference $T = t_b - t_a$.

Fig. 5-4 The real part of the free particle kernel $\mathcal{K}_0$ as a function of
time. The function is zero for negative times, then starts with a sharp
jump at T = 0 and continues as a cosine wave of constant amplitude and
frequency.

$\mathcal{K}_0$ is zero for T less than zero, and it suddenly begins to oscillate
at T = 0. The transformation from time to energy representation is
equivalent to a Fourier transformation. Since the wave has a sharp
beginning (at T = 0), the Fourier transform contains components at all
frequencies and thus at all energies. If the function extends over a long
time interval (many periods), then one frequency begins to dominate in
the Fourier transform. For the free particle this dominating frequency
corresponds to the energy $E_0 = p^2/2m$.
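
The way one frequency comes to dominate can be illustrated with a wave exp(−iω₀τ) that is switched on at τ = 0 and followed only for a finite duration. Its transform at frequency ω depends on the detuning ω − ω₀ and on the duration, and has a simple closed form. The sketch below is an assumption-laden illustration (finite observation window, hypothetical function names), not a calculation from the text.

```python
import cmath


def truncated_wave_transform(detuning, duration):
    """Integral over 0..duration of exp(i * detuning * tau) d tau, in closed form.

    detuning = omega - omega0 is the offset from the wave's own frequency.
    """
    if detuning == 0.0:
        return complex(duration)
    return (cmath.exp(1j * detuning * duration) - 1.0) / (1j * detuning)


def relative_weight(detuning, duration):
    """|a(omega0 + detuning)|^2 divided by the peak value |a(omega0)|^2 = duration^2."""
    return abs(truncated_wave_transform(detuning, duration)) ** 2 / duration**2
```

At a fixed detuning the relative weight stays near 1 for a short duration and falls at least as fast as 4/(detuning × duration)² for a long one, so the spectrum narrows around ω₀ as the wave is followed over more periods.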

It is for this reason that the free-particle kernel contains the factor

$$\frac{i}{E_a - p_a^2/2m + i\epsilon} = \text{P.P.}\left(\frac{i}{E_a - p_a^2/2m}\right) + \pi\delta(E_a - p_a^2/2m) \tag{5.19}$$

Here the first term on the right accounts for the transient effects that
result from the sudden start at T = 0. The second term gives the steady-
state behavior and shows that, if we wait long enough, the only energy
found is the usual p²/2m; but near T = 0 the energy is not given by this
classical formula.


Problem 5-2 If we transform only the time and not the spatial
variables, defining

$$k(x_b,E_b;x_a,E_a) = \int_{-\infty}^{\infty}\int_{t_a}^{\infty} e^{+(i/\hbar)E_b t_b}\,K(x_b,t_b;x_a,t_a)\,e^{-(i/\hbar)E_a t_a}\,dt_b\,dt_a \tag{5.20}$$

show that for a system with a time-independent hamiltonian H

$$k(x_b,E_b;x_a,E_a) = 2\pi i\hbar^2\,\delta(E_b - E_a) \sum_n \frac{\phi_n(x_b)\,\phi_n^*(x_a)}{E_a - E_n + i\epsilon} \tag{5.21}$$

where $E_n$ and $\phi_n(x)$ are the eigenvalues and eigenfunctions of H.


5-2 MEASUREMENT OF QUANTUM-MECHANICAL VARIABLES


The Characteristic Function. In the preceding section we have 
shown how an experiment designed to measure momentum leads to a 
definition of a probability distribution for the momentum. That is, from 
the results of a correctly designed experiment we can answer the ques- 
tion: What is the probability that the momentum of a particle is p? 
From the existence of a probability function for momentum we were led 
to the discovery of a wave function or amplitude written in terms of mo- 
mentum variables. In fact, we found that a system could be completely 
described and problems completely analyzed in a momentum-energy rep- 
resentation as well as the space-time representation which we have used 
heretofore. 

These same results apply to physical variables other than momentum.
If any physical quantity can be measured experimentally, a probability
function can be associated with it. That is, if an experiment is capable
of measuring some characteristic A associated with a system (e.g., the x
component of momentum), then after repeating the experiment several
times it will be possible to construct the probability function P(a) which
gives the probability that in any particular experiment the numerical
value of A will be found to be equal to a.

In general, it is possible to associate a probability amplitude with 
such a probability function. This amplitude would be defined in terms 
of the measured variable, together with other variables necessary to 
complete the specification. Let us see what is involved by generalizing 
our example of a momentum measurement. First we shall take just one 
dimension, but the extension to several dimensions will be obvious. We 
ask: Does the system have the property G? For example, G might stand 
for the statement: The value of the quantity A is equal to a. We must 
have some way to answer this experimentally. So let us imagine that some
equipment can be designed so that, if the system has the property G, the
particle will pass through the equipment and arrive at a certain location
y on some screen or meter.

The probability of this may be written

$$P(G) = \left|\int_{-\infty}^{\infty} K_{\text{exp}}(y,x)\,f(x)\,dx\right|^2 \tag{5.22}$$

where f(x) is the wave function of the system to be measured, $K_{\text{exp}}(y,x)$
is the kernel for going through the particular experimental apparatus,
and y is the position of arrival for particles with the property G. This
probability has the alternative mathematical form

$$P(G) = \left|\int_{-\infty}^{\infty} g^*(x)\,f(x)\,dx\right|^2 \tag{5.23}$$

where we have defined

$$g^*(x) = K_{\text{exp}}(y,x) \tag{5.24}$$

(Defining this as the complex conjugate of a function is just for convenience,
as we shall see later.) So° we can say

$$\psi(G) = \int_{-\infty}^{\infty} g^*(x)\,f(x)\,dx \tag{5.25}$$

is the amplitude that the system has the property G. This concept is
further described in Fig. 5-5.

The property is defined by the function g*(x) for the following reason.
Suppose that some other experiment with different equipment, and
hence a different kernel $K'_{\text{exp}}(y',x)$, should be built to measure the same
property. In this second experiment the particle arrives at y'. Then the
probability of finding that the system has the property G is

$$P(G) = \left|\int_{-\infty}^{\infty} g'^*(x)\,f(x)\,dx\right|^2 = \left|\int_{-\infty}^{\infty} K'_{\text{exp}}(y',x)\,f(x)\,dx\right|^2 \tag{5.26}$$


Since the property measured is the same, we must obtain the same 
result in every case for P(G) as we did with the previous experiment. 


That is to say, we must have

$$\left|\int_{-\infty}^{\infty} g^*(x)\,f(x)\,dx\right|^2 = \left|\int_{-\infty}^{\infty} g'^*(x)\,f(x)\,dx\right|^2 \tag{5.27}$$

for any arbitrary function f(x). This means g*(x) = g'*(x) within at
least an unimportant constant phase factor $e^{i\delta}$. That is, all methods to
determine the same property correspond (within a phase) to the same
g*(x). For this reason we call g*(x) the characteristic function of the
property G.

Fig. 5-5 A device designed to measure the property G is placed between the
incoming particle (with wave function f(x)) and the final point y. The equipment
modifies the kernel for the motion (compare Figs. 5-1 and 5-2), making it equal
to g*(x). The product g*(x) f(x), integrated over x, is the amplitude to arrive
at y after passing through the equipment.

We may ask another question. What must the state f(x) be so that it
is sure to have the property G? (For example, what is the wave function
for a particle whose momentum is definite?) That is, we wish to find
an f(x), say F(x), so that the particle going through the apparatus will
certainly arrive at y and at no other point ỹ. The amplitude to arrive at
ỹ should be proportional to δ(y − ỹ) (that is, zero unless ỹ = y). Hence

$$\int_{-\infty}^{\infty} K_{\text{exp}}(\tilde{y},x)\,F(x)\,dx = \delta(y - \tilde{y}) \tag{5.28}$$

This we can solve by the relation of the complex conjugate of a kernel
to its inverse, discussed in Sec. 4-1. We have from Eq. (4.37)

$$\int_{-\infty}^{\infty} K_{\text{exp}}(\tilde{y},x)\,K^*_{\text{exp}}(y,x)\,dx = \delta(y - \tilde{y}) \tag{5.29}$$

so that

$$F(x) = K^*_{\text{exp}}(y,x) = g(x) \tag{5.30}$$

That is, g(x) is the wave function of a particle having the property G
with certainty. We can say either (1) the particle has the property G or
(2) the particle is in the state g(x). So we find: If a particle is in a state
f(x), the amplitude that it will be found in a state g(x) is

$$\psi(G) = \int_{-\infty}^{\infty} g^*(x)\,f(x)\,dx \tag{5.31}$$


For more dimensions, x becomes a space of several variables. 

We might say loosely: The probability that the particle is in the state
g(x) is |∫ g*(x) f(x) dx|². This is all right if we know what we mean.
The system is in state f(x), so it is not in g(x); but if a measurement
is made to ask if it is also in g(x), the answer will be affirmative with
probability

$$P(G) = \left|\int_{-\infty}^{\infty} g^*(x)\,f(x)\,dx\right|^2 \tag{5.32}$$

A measurement which asks “Is the state g(x)?” will always have the
answer yes if the wave function actually is g(x). For all other wave 
functions, repetition of the experiment will result in yes some fraction P 
(between 0 and 1) of the tries. This is a central result for the probabilistic 
interpretation of the theory of quantum mechanics. 
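
A small numerical example may make the yes-or-no measurement concrete. The Python sketch below evaluates |∫ g*(x) f(x) dx|² for two normalized Gaussian states; the Gaussian shapes, the integration range, and the function names are assumptions made for this illustration.

```python
import math


def gaussian_state(center, sigma=1.0):
    """Return a normalized real Gaussian wave function centered at `center`."""
    norm = (2.0 * math.pi * sigma**2) ** -0.25

    def state(x):
        return norm * math.exp(-((x - center) ** 2) / (4.0 * sigma**2))

    return state


def probability_of_property(g, f, x_min=-20.0, x_max=20.0, n=4000):
    """P(G) = |integral of conj(g(x)) f(x) dx|^2, evaluated with the midpoint rule."""
    dx = (x_max - x_min) / n
    amplitude = 0.0 + 0.0j
    for k in range(n):
        x = x_min + (k + 0.5) * dx
        amplitude += complex(g(x)).conjugate() * f(x)
    return abs(amplitude * dx) ** 2
```

When f and g are the same state the probability should come out as 1. For two such Gaussians whose centers are a distance d apart it should equal exp(−d²/4σ²), a fraction between 0 and 1, in line with the statement above.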

From all of this we deduce an interesting inverse relationship be- 
tween a wave function and its complex conjugate. In accordance with 
the interpretation of Eq. (5.25), g*(x) is the amplitude that if a sys- 
tem is at position x, then it has the property G. (Such a statement is 
put mathematically by substituting a Dirac delta function for f(x) in
Eq. (5.31).) On the other hand, g(x) is the amplitude that if the system 
has the property G, it is at position x. (This is just a way of giving the 
definition of a wave function.) One function gives the amplitude for: If 
A, then B. The other function gives the amplitude for: If B, then A. 
The inversion is accomplished simply by taking the complex conjugate. 

Equation (5.31) can be interpreted as follows: The amplitude that 
a system has property G is (1) the amplitude f(x) that it is at x times 
(2) the amplitude g*(x) that if it is at x, it has property G, with this 
product summed over the alternatives x.


Problem 5-3 Assume $\int_{-\infty}^{\infty} f^*(x)\,f(x)\,dx$, which is the probability
that a particle of wave function f(x) is somewhere, has been normalized
to the value 1. Under this constraint, show that the state f(x) which
has the highest probability of having the property G is f(x) = g(x).


Problem 5-4 Suppose the wave function for a system is ψ(x) at
time $t_a$. Suppose further that the behavior of the system is described by
the kernel $K(x_b,t_b;x_a,t_a)$ for motions in the interval $t_b > t > t_a$. Show
that the probability that the system is found to be in the state χ(x) at
time $t_b$ is given by the absolute square of the integral

$$\iint \chi^*(x_b)\,K(x_b,t_b;x_a,t_a)\,\psi(x_a)\,dx_a\,dx_b$$

We call this integral the transition amplitude to go from state ψ(x) to
state χ(x).


Measurements of Several Variables. In the considerations of the 
preceding section we assumed an ideal experiment, which means that no 
quantity besides A could be measured at the same time. That is, we do 
not allow that more than one g(x) would give the same result, but assert 
that the maximum possible amount of information has been obtained 
from the system by a measurement of A.

Now in reality it is common for several variables to determine the
state of a system. For example, if only the x component of momentum is
measured in a three-dimensional system, no definite g(x) can be defined.
Both the wave functions $\exp\{ip_x x/\hbar\}$ and $\exp\{ip_x x/\hbar - ip_y y/\hbar\}$ give
the same value $p_x$ for the x component of momentum. So if only $p_x$ is
measured in a three-dimensional system, the particle could be moving
with any component of momentum in the y direction and not change
the outcome of the measurement. Nor need the particle come to some
unique point in the measuring apparatus. All the particles which arrive
at some line or set of points could have the same value for $p_x$.

Thus in general, we see that the wave function g(x) defines the property
G as follows: A state described by the wave function g(x) is certain
to have the property G. However, the converse is not necessarily true.
That is, it is not certain that all states having the property G are described
by the wave function g(x). Only if G includes a specification
of all the quantities that may be simultaneously measured is the wave
function completely defined by G. Even then there remains an undefined
(and unimportant) constant phase factor $e^{i\delta}$.

It is easy to make the necessary extension of the characteristic func- 
tion g*(x) when the ideal experiment requires the measurement of more 
than one variable. Thus suppose we have a set of quantities which we 
shall call A, B, C, ..., and which can all be simultaneously measured 
in an experiment: For example, the x component of momentum, the y 
component of momentum, etc. Suppose we can completely describe the 
state of a system by specifying the numerical values a, b, c,... assigned 
to these quantities. That is, we completely describe the state by saying 
whether or not it has a certain property. In this case the property in 
question is that the value of A is a, the value of B is b, etc. Furthermore, 
suppose that no additional information (information not derivable from 
a knowledge of the numerical values of A, B, etc.) could be obtained 
simultaneously by any means. 

Imagine we have an experimental setup capable of measuring all
these quantities, i.e., capable of telling us whether or not the state has
the property that the value of A is a, etc. We shall call the characteristic
function of such a property

$$g^*(x) = \chi^*_{a,b,c,\ldots}(x) \tag{5.33}$$

This function is, of course, a function of the numerical values a, b,
c, ... which the experiment is set up to measure, as well as the
coordinate variable x.

Suppose the system is in state f(x). Then the probability that the
experiment would show that the value of A is a, the value of B is b,
etc. (i.e., the probability that the state has the property in question), is

$$P(a,b,c,\ldots) = \left|\int_{-\infty}^{\infty} \chi^*_{a,b,c,\ldots}(x)\,f(x)\,dx\right|^2 \tag{5.34}$$


Transformation Functions. Suppose the system is actually in the
state $\chi_{a',b',c',\ldots}(x)$, that is, the value of A is a', etc. Then with our
experiment the probability of finding the system in a state described by
a, b, c, ... is zero unless a = a', b = b', c = c', .... This means that,
with suitable normalizing factors, we have

$$\int_{-\infty}^{\infty} \chi^*_{a,b,c,\ldots}(x)\,\chi_{a',b',c',\ldots}(x)\,dx = \delta(a - a')\,\delta(b - b')\,\delta(c - c')\cdots \tag{5.35}$$

The function $\chi_{a,b,c,\ldots}(x)$ is the amplitude that if the system is in
the state described by a, b, c, ..., then it will be found at x. The
function $\chi^*_{a,b,c,\ldots}(x)$, which we have called the characteristic function, is
the amplitude that, if the system is at x, it will be found in the state
specified by a, b, c, ....

If the system is in the state f(x), then

$$F_{a,b,c,\ldots} = \int_{-\infty}^{\infty} \chi^*_{a,b,c,\ldots}(x)\,f(x)\,dx \tag{5.36}$$

is the amplitude to find the system in the state specified by A having
the value a, B having the value b, etc.

The quantities $F_{a,b,c,\ldots}$ are just as good a representation of the state
as the function f(x, y, z, ...). In fact, if we know the function $F_{a,b,c,\ldots}$ we
can reproduce the function f(x, y, z, ...) by means of an inverse
transformation.

The function $F_{a,b,c,\ldots}$ is called the A, B, C, ... representation of
the state. (In the preceding section we had an example of this in the
momentum representation.) The function f(x, y, z, ...) is the customary
coordinate representation, or x, y, z, ... representation, of the state.
Transformations between the two are carried out with the help of the
functions χ and χ*. In particular, the function $\chi^*_{a,b,c,\ldots}(x,y,z,\ldots)$ is the
transformation function going from the x, y, z, ... representation to
the A, B, C, ... representation, while the function $\chi_{a,b,c,\ldots}(x,y,z,\ldots)$
is the transformation function going in the opposite direction. Thus the
inverse of the transformation given by Eq. (5.36) is

$$f(x,y,z,\ldots) = \sum_a \sum_b \sum_c \cdots F_{a,b,c,\ldots}\,\chi_{a,b,c,\ldots}(x,y,z,\ldots) \tag{5.37}$$

This says that the amplitude to be found at x is the amplitude $F_{a,b,c,\ldots}$
to be found with A = a, B = b, ... times the amplitude $\chi_{a,b,c,\ldots}(x)$ to
be at x if A = a, B = b, etc., summed over alternatives a, b, c, ....
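
A discrete example shows the two transformations at work. In the Python sketch below the role of the functions χ is played by the energy eigenfunctions of a particle in a box, so that a single integer label n replaces a, b, c, .... The box, the example state f(x), the truncation at `n_max`, and all names are assumptions chosen for illustration.

```python
import math

L = 1.0  # assumption: box of unit length, used only for illustration


def chi(n, x):
    """Transformation function chi_n(x): n-th eigenfunction of a particle in a box."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)


def f(x):
    """An example normalized state inside the box."""
    return math.sqrt(30.0 / L**5) * x * (L - x)


def integrate(func, n_steps=2000):
    """Midpoint rule on [0, L]."""
    dx = L / n_steps
    return sum(func((k + 0.5) * dx) for k in range(n_steps)) * dx


def to_n_representation(state, n_max=25):
    """F_n = integral of chi_n(x) state(x) dx (chi_n is real), the analog of Eq. (5.36)."""
    return {n: integrate(lambda x, n=n: chi(n, x) * state(x)) for n in range(1, n_max + 1)}


def to_x_representation(coeffs, x):
    """f(x) = sum over n of F_n chi_n(x), the analog of Eq. (5.37)."""
    return sum(F * chi(n, x) for n, F in coeffs.items())
```

Calling `to_n_representation(f)` gives the amplitudes F_n, whose squares should add up to nearly 1, and `to_x_representation` applied to those amplitudes should reproduce f(x) to within the truncation error of the finite sum.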


Problem 5-5 Assume that the function f(x, y, z, ...) can be represented by

$$f(x,y,z,\ldots) = \sum_a \sum_b \sum_c \cdots F'_{a,b,c,\ldots}\,\chi_{a,b,c,\ldots}(x,y,z,\ldots) \tag{5.38}$$

By substituting this relation into Eq. (5.36), and using the orthogonal
properties of χ as defined by Eq. (5.35), show that $F'_{a,b,c,\ldots} = F_{a,b,c,\ldots}$.


Problem 5-6 Suppose A, B, and C are the three cartesian components
of momentum $p_x$, $p_y$, $p_z$. What is the form of the function
$\chi_{a,b,c}(x,y,z)$? Using the results of Sec. 5-2, verify the relations obtained
in Sec. 5-1.


Problem 5-7 Suppose that the A, B, C, ... representation does
not correspond to either coordinate representation or momentum representation,
but instead is some third way of representing the state of the
system. Suppose we know the function $\chi_{a,b,c,\ldots}(x,y,z,\ldots)$ which permits
us to transform back and forth between coordinate representation
and A, B, C, ... representation. Suppose further that we know the
transformation function necessary to transform back and forth between
coordinate representation and momentum representation. What then is
the function necessary for the transformation between momentum
representation and A, B, C, ... representation?


5-3 OPERATORS


Expected Values. We can develop a few further properties of these 
transformation functions. Let us try to answer this question: A system 
is in a state specified by the wave function f(x), and the quantity A 
is measured. If the measurement is repeated many times, what is the 
average value which will be obtained for A? We shall denote this average 
value (sometimes called the expected value) by the symbol ⟨A⟩.

Suppose it is possible, in principle, to measure simultaneously sev- 
eral physical quantities A, B, C,..., where a measurement of A could 
produce any one of a continuous or discrete set of values {a}, a mea- 
surement of B could produce any one of a continuous or discrete set of 
values {b}, etc. The probability of obtaining one particular set of values
a, b, c, ... is $|F_{a,b,c,\ldots}|^2$. So the probability of obtaining a particular
value a in a measurement of A, irrespective of the values taken on by
B, C, ... (for example, if B, C, ... were not measured at all) is

$$P(a) = \sum_b \sum_c \cdots |F_{a,b,c,\ldots}|^2 \tag{5.39}$$

In this equation summations are carried out over all possible values in
the continuous or discrete sets of {b}, {c}, ....

The average, or expected, value resulting from a measurement of A is
obtained by multiplying the probability of Eq. (5.39) by a and summing
the result over all possible values of a. Thus

$$\langle A\rangle = \sum_a \sum_b \sum_c \cdots a\,|F_{a,b,c,\ldots}|^2 \tag{5.40}$$


The need for computing such expected values arises frequently in 
quantum-mechanical problems. It is useful to have available formulas 
which simplify such computations. This subject, the subject of oper- 
ators, was discussed briefly in Sec. 4-1. Now we shall develop a few 
additional results. However, nowhere in this book shall we attempt a 
really thorough study of operator calculus, since several excellent works 
along this line are already available.¹


The Operator. Let us try to express the expected value of A 
directly in terms of the original wave function f(x). Note first that the 
absolute square of Fa,b,c,... can be written as 


Biel == ee ee (5.41) 
Then, using Eq. (5.36), we can write 
EEE af Kase Od f Xine (NE 


p F f” (£) R(x) dx (5.42) 


In the second line of this equation we have made use of the substitution 


R) = A GALETE dz’ (5.43) 


AO 


where we have written 
Gal (x, x’) p? > P Xa, DCs E ee: (x) (5.44) 


1 For example, see P.A.M. Dirac, “The Principles of Quantum Mechanics,” Claren- 
don Press, Oxford, 1947. 
Equation (5.43) says that the function $R(x)$ results from the function
$f(x)$ by an integration performed with the help of a linear
integral operator $G_A(x, x')$ associated with the quantity A. Often an
equation like Eq. (5.43) is symbolized by the notation

$$R = \mathcal{A} f \qquad (5.45)$$

where $\mathcal{A}$ stands for a linear operator which operates on the function f.
In the present case $\mathcal{A}$ stands for the operation displayed on the right-
hand side of Eq. (5.43), that is, multiplication by the function $G_A$ and
integration. The operator $\mathcal{A}$ is associated with the physical quantity A.
Using this notation, we can write

$$\langle A \rangle = \int f^*(x)\, \mathcal{A} f(x)\, dx = \int\!\!\int f^*(x)\, G_A(x,x')\, f(x')\, dx'\, dx \qquad (5.46)$$
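As a rough numerical illustration of Eqs. (5.40) to (5.46), the sketch below replaces the integrals over x by sums over a two-point grid and checks that the two routes to the expected value agree. The basis vectors, the values a = +1 and a = -1, and the wave function f are hypothetical choices made only for this example; they do not come from the text.

```python
import cmath

# Hypothetical two-point "grid": each chi is an orthonormal characteristic
# function, paired with the value a that a measurement of A would return.
s = 2 ** -0.5
basis = [
    (+1.0, [s, s]),
    (-1.0, [s, -s]),
]

# Hypothetical normalized wave function f(x) on the same grid.
f = [0.6, 0.8 * cmath.exp(1j * cmath.pi / 3)]


def inner(u, v):
    """Discrete stand-in for the integral of u*(x) v(x) dx."""
    return sum(complex(ui).conjugate() * vi for ui, vi in zip(u, v))


# Route 1, Eq. (5.40): sum over a of a * |F_a|^2, with F_a = <chi_a | f>.
expect_sum = sum(a * abs(inner(chi, f)) ** 2 for a, chi in basis)

# Route 2, Eq. (5.44): kernel G_A(x, x') = sum_a a chi_a(x) chi_a*(x').
n = len(f)
G = [[sum(a * chi[i] * complex(chi[j]).conjugate() for a, chi in basis)
      for j in range(n)] for i in range(n)]

# Eq. (5.43): R = A f, then Eq. (5.46): <A> = integral of f* R.
R = [sum(G[i][j] * f[j] for j in range(n)) for i in range(n)]
expect_op = inner(f, R)

print(expect_sum, expect_op)
```

The operator route never needs the individual amplitudes $F_a$; once $G_A$ is known, any wave function can be fed through it.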


Problem 5-8 Note that Eq. (5.44) implies $G_A^*(x, x') = G_A(x', x)$.
With this in mind show that for any two wave functions g(x) and f(x),
both of which approach 0 as x goes to $\pm\infty$,

$$\int g^*(x)\, \mathcal{A} f(x)\, dx = \int [\mathcal{A} g(x)]^* f(x)\, dx \qquad (5.47)$$

Any operator, such as $\mathcal{A}$, for which Eq. (5.47) holds is called hermitian
(see Eq. 4.30).


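Problem 5-9 The transformation function between space representation
and momentum representation is

$$\chi_{p_x,p_y,p_z}(\mathbf{x}) = e^{(i/\hbar)\,\mathbf{p}\cdot\mathbf{x}} \qquad (5.48)$$

(see Prob. 5-6). Choose the physical quantity A as the momentum $p_x$
in the x direction. Show that the function $G_A$ is

$$G_{p_x}(\mathbf{x},\mathbf{x}') = \frac{\hbar}{i}\,\delta'(x - x')\,\delta(y - y')\,\delta(z - z') \qquad (5.49)$$

where $\delta'(x) = \frac{d}{dx}\delta(x)$. With this result determine the operator
corresponding to the x component of momentum and show that the expected
value of this component of momentum can be written as

$$\langle p_x \rangle = \int f^*(\mathbf{x})\, \frac{\hbar}{i} \frac{\partial f}{\partial x}\, d^3\mathbf{x} \qquad (5.50)$$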




Problem 5-10 Suppose the quantity A corresponds to the x coordinate
of position. Show that the correct formula for the expected value
of x results when the function $G_A(\mathbf{x}, \mathbf{x}')$ is taken to be

$$G_x(\mathbf{x},\mathbf{x}') = x\,\delta(x - x')\,\delta(y - y')\,\delta(z - z') \qquad (5.51)$$

and the operator corresponding to x is simply multiplication by x, that
is,

$$\mathcal{X} f(\mathbf{x}) = x f(\mathbf{x}) \qquad (5.52)$$


Eigenfunctions and Eigenvalues. The wave function $\chi_{a,b,c,\ldots}(x)$,
as discussed in Sec. 5-2, shows a particularly simple behavior when
subjected to the operation $\mathcal{A}$. Thus

$$\mathcal{A}\chi_{a,b,c,\ldots}(x) = a\,\chi_{a,b,c,\ldots}(x) \qquad (5.53)$$


Problem 5-11 Show that this last result is true. 


When a function $\chi$ satisfies an equation such as (5.53), we say that
$\chi$ is an eigenfunction of the operator $\mathcal{A}$ associated with the eigenvalue a.

If two physical quantities can be simultaneously measured, then the 
operators associated with these quantities, A and B, for example, satisfy 
an interesting relationship, namely, A(Bf) = B(Af). This relation says 
that the result of performing one operation after the other is the same 
regardless of the order in which the operations are performed. In this 
case the two operators are said to commute: 


$$\mathcal{A}\mathcal{B} = \mathcal{B}\mathcal{A}$$


In general, we cannot expect the commutation relation to hold be- 
tween operators, but in this special case it does. The reason for this is 
that if A and B are physical quantities which can be measured simul- 
taneously, they can form part of a set A, B, C,... of simultaneously 
measurable quantities with a single characteristic function $\chi_{a,b,c,\ldots}$. If
the operator $\mathcal{B}$ is substituted for $\mathcal{A}$ and the value b is substituted for a
in Eq. (5.53), the result is still valid, so

$$\mathcal{A}(\mathcal{B}\chi) = \mathcal{A}(b\chi) = b(\mathcal{A}\chi) = ba\chi = ab\chi \qquad (5.54)$$

which is true, since a and b are just numbers. Now also

$$\mathcal{B}(\mathcal{A}\chi) = \mathcal{B}(a\chi) = a(\mathcal{B}\chi) = ab\chi \qquad (5.55)$$

A comparison of these two equations proves the commutation of the
operators A and B when acting upon any of the functions $\chi_{a,b,c,\ldots}$. Since
both these operations are linear (i.e., they do not involve computations
with higher powers of the function $\chi$), the commutation relation must
also apply to any linear combination of the $\chi$ functions.

If the $\chi$ functions constitute a “complete set” (which is typical) we
can construct any function at all from such a linear combination. So 
the operation AB and the operation BA give the same result on any 
function; that is, they commute. 
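The argument above can be tried out on a small matrix model. In this sketch, two operators are built from the same set of eigenfunctions with different eigenvalues, and a third is built from a different set. All numbers are hypothetical and chosen only for illustration; the functions are real, so complex conjugation is omitted.

```python
# Shared orthonormal eigenfunctions on a two-point grid (hypothetical values).
s = 2 ** -0.5
chis = [[s, s], [s, -s]]
a_vals = [2.0, -1.0]   # eigenvalues of A, one per chi
b_vals = [0.5, 3.0]    # eigenvalues of B, one per chi


def from_eigen(vals):
    """Build the matrix sum_k vals[k] * chi_k chi_k^T (real chi, no conjugate)."""
    n = len(chis[0])
    return [[sum(v * chi[i] * chi[j] for v, chi in zip(vals, chis))
             for j in range(n)] for i in range(n)]


def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))


A = from_eigen(a_vals)
B = from_eigen(b_vals)
commuting_gap = max_abs_diff(matmul(A, B), matmul(B, A))

# An operator whose eigenfunctions are the grid points themselves,
# so it does not share the chi functions with A.
C = [[1.0, 0.0], [0.0, -1.0]]
noncommuting_gap = max_abs_diff(matmul(A, C), matmul(C, A))

print(commuting_gap, noncommuting_gap)
```

Operators that share a complete set of eigenfunctions commute to rounding error, while the pair without a common set does not.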


Problem 5-12 Show that the x coordinate of position and the 
x coordinate of momentum are not simultaneously measurable quanti- 
ties. 


There are situations in which a set of commuting mathematical
operators $\mathcal{A}, \mathcal{B}, \mathcal{C}, \ldots$ are already known and it is required to find the
functions (the eigenfunctions) which are associated with them. This
requires solving a set of equations such as

$$\mathcal{A}\chi = a\chi \qquad \mathcal{B}\chi = b\chi \qquad \mathcal{C}\chi = c\chi \qquad \cdots \qquad (5.56)$$


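For example, suppose the operators for momentum in the x, y, z
directions $p_x$, $p_y$, $p_z$ are given as
$\frac{\hbar}{i}\frac{\partial}{\partial x}$, $\frac{\hbar}{i}\frac{\partial}{\partial y}$, $\frac{\hbar}{i}\frac{\partial}{\partial z}$. What are the
eigenfunctions of this set of operators corresponding to a state in which
$p_x$ has the value a, $p_y$ has the value b, and $p_z$ has the value c? (These
are, of course, the eigenvalues.) We must solve the equations

$$\frac{\hbar}{i}\frac{\partial\chi}{\partial x} = a\chi \qquad \frac{\hbar}{i}\frac{\partial\chi}{\partial y} = b\chi \qquad \frac{\hbar}{i}\frac{\partial\chi}{\partial z} = c\chi \qquad (5.57)$$

and the solution is some arbitrary constant times $e^{(i/\hbar)(ax+by+cz)}$. This
agrees with our previous knowledge that a particle with a definite
momentum p has the wave function $e^{(i/\hbar)\,\mathbf{p}\cdot\mathbf{x}}$.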
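A quick one-dimensional check of this eigenvalue statement: the sketch below applies a centered finite difference in place of the derivative and confirms that the plane wave is returned multiplied by a. The unit choice ħ = 1, the value of a, the sample point, and the step size are assumptions of the example.

```python
import cmath

hbar = 1.0   # units with hbar = 1 (assumption)
a = 1.7      # hypothetical momentum eigenvalue


def chi(x):
    """One-dimensional plane wave e^{(i/hbar) a x}."""
    return cmath.exp(1j * a * x / hbar)


def p_op(func, x, h=1e-5):
    """(hbar/i) d/dx approximated by a centered finite difference."""
    return (hbar / 1j) * (func(x + h) - func(x - h)) / (2 * h)


x0 = 0.3
ratio = p_op(chi, x0) / chi(x0)   # should be close to the eigenvalue a
print(ratio)
```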


Interpretation of Energy Expansion. Various expressions involving
$\phi_n(x)$ can be interpreted more completely now. For example,
consider the expansion in Eq. (4.59) of the kernel in terms of the
solutions $\phi_n(x)$ of a constant hamiltonian

$$K(x_b,t_b;x_a,t_a) = \sum_n \phi_n(x_b)\,\phi_n^*(x_a)\,e^{-(i/\hbar)E_n(t_b-t_a)} \qquad (5.58)$$

We notice first that $\phi_n(x)$ is the amplitude that if we are in energy
state n, we are at position x. Therefore, from our previous discussion
(Sec. 5-2), $\phi_n^*(x)$ is the amplitude that if we are at x, we are in n. Now
let us interpret Eq. (5.58) this way. The amplitude to get from position
$x_a$ at time $t_a$ to position $x_b$ at time $t_b$ is the sum over alternatives. This
time the alternatives will be divided into the various energy states in
which the transition can be made. Thus we must sum over all of the
energy states n the product of the following terms:

1. $\phi_n^*(x_a)$, which is the amplitude that if we are at $x_a$, then
we are in the energy state n.

2. $e^{-(i/\hbar)E_n(t_b-t_a)}$, which is the amplitude to be in energy
state n at the time $t_b$ if we were in the energy state n at
the time $t_a$.¹

3. $\phi_n(x_b)$, which is the amplitude to be found at $x_b$ when we
are in the energy state n.
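The three factors listed above can be assembled numerically for a toy system. The following sketch builds a discrete analog of Eq. (5.58) from two hypothetical energy states on a two-point grid (with ħ = 1 assumed) and checks that following the kernel for a time t1 and then for a time t2 gives the kernel for t1 + t2, as expected when there is no amplitude to change the energy state.

```python
import cmath

hbar = 1.0                  # units with hbar = 1 (assumption)
s = 2 ** -0.5
phis = [[s, s], [s, -s]]    # hypothetical real orthonormal phi_n on a 2-point grid
energies = [0.4, 1.3]       # hypothetical energies E_n


def kernel(dt):
    """Discrete analog of Eq. (5.58).

    K[b][a] = sum_n phi_n(b) phi_n*(a) exp(-i E_n dt / hbar); the phi_n here
    are real, so the complex conjugate is a no-op and is omitted.
    """
    n = len(phis[0])
    return [[sum(phi[b] * phi[a] * cmath.exp(-1j * E * dt / hbar)
                 for phi, E in zip(phis, energies))
             for a in range(n)] for b in range(n)]


t1, t2 = 0.7, 1.9
K1, K2, K12 = kernel(t1), kernel(t2), kernel(t1 + t2)

# Sum over the intermediate position c of K(b, c; t2) K(c, a; t1).
composed = [[sum(K2[b][c] * K1[c][a] for c in range(2))
             for a in range(2)] for b in range(2)]
composition_gap = max(abs(composed[b][a] - K12[b][a])
                      for a in range(2) for b in range(2))

# At zero elapsed time the kernel reduces to the identity (completeness).
K_zero = kernel(0.0)
identity_gap = max(abs(K_zero[b][a] - (1.0 if a == b else 0.0))
                   for a in range(2) for b in range(2))

print(composition_gap, identity_gap)
```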


Problem 5-13 Discuss the possibility of interpreting $\phi_n(x)$ as a
$\chi_{a,b,c,\ldots}(x)$ function discussed in Sec. 5-2. That is, say $\phi_n(x)$ is the
transformation function to go from the x representation to a
representation specified by n (energy representation).


¹There is no amplitude to change the state. That is the importance of these
particular states $\phi_n(x)$.


The Perturbation Method 
in Quantum Mechanics 


6-1 


If a quantum-mechanical system is subjected to a potential energy which
introduces only quadratic terms into the action, then we have seen in 
Sec. 3-5 how the resulting motion can be determined with the path 
integral method. However, many of the interesting potentials which 
arise in quantum-mechanical problems are not of this special type and 
cannot be handled so easily. In this chapter we shall develop a method 
of treating more complicated potentials. The method which we discuss, 
called the perturbation expansion, is most useful when the potential is 
comparatively weak (compared, for instance, to the kinetic energy of the 
system).

Although the perturbation expansion can be developed along strictly 
mathematical lines, it is capable of an interesting physical interpreta- 
tion. This interpretation, which we shall also present, leads to a deeper 
understanding of quantum-mechanical behavior. 

In the second section of this chapter we shall undertake a special 
application of the perturbation method. We shall consider the motion of 
an electron when it is scattered by an atom. In describing the scattering 
interaction we shall find useful the classical notion of a cross-sectional
area which the atom presents to the impinging electron. Although this 
area is related to the actual size of the atom, we shall find that its 
complete description depends upon the quantum-mechanical aspects of 
the interacting system. 


THE PERTURBATION EXPANSION 


The Terms of the Expansion. Suppose a particle is moving in a
potential V(x,t). For the present, the motion will be restricted to one
dimension. Then the kernel for motion between the points a and b is

$$K_V(b,a) = \int_a^b \exp\left\{\frac{i}{\hbar}\int_{t_a}^{t_b}\left[\frac{m}{2}\dot{x}^2 - V(x,t)\right]dt\right\}\mathcal{D}x(t) \qquad (6.1)$$

The subscript notation $K_V$ is used to remind us that the particle is in
the potential V. The notation $K_0$ denotes the kernel for the motion of
a free particle.

In some cases the kernel $K_V$ can be determined by the methods
already studied. For instance, in Sec. 3-6 we determined the kernel
for the harmonic oscillator subject to an outside force f(t). Here the
potential was (see Eq. 3.65)

$$V(x,t) = \frac{m\omega^2}{2}x^2 - f(t)\,x \qquad (6.2)$$

In general, we have found that if the potential is quadratic in x, the
kernel can be determined exactly, whereas if it is sufficiently slowly
varying, the semiclassical approximation is adequate. There are some other
types of potentials which can be successfully treated with the help of
Schrödinger’s equation. Now we are studying a technique which is often
useful if the effect of the potential is small.

Suppose the potential is small, or more precisely, suppose the time
integral of the potential along a path is small compared to $\hbar$. Then the
part of the exponential of Eq. (6.1) which depends upon V(x,t) can be
expanded as

$$\exp\left[-\frac{i}{\hbar}\int_{t_a}^{t_b}V(x,t)\,dt\right] = 1 - \frac{i}{\hbar}\int_{t_a}^{t_b}V(x,t)\,dt + \frac{1}{2!}\left(\frac{i}{\hbar}\right)^2\left[\int_{t_a}^{t_b}V(x,t)\,dt\right]^2 + \cdots \qquad (6.3)$$

which is defined along any particular path x(t).
Using this expansion in Eq. (6.1) results in

$$K_V(b,a) = K_0(b,a) + K^{(1)}(b,a) + K^{(2)}(b,a) + \cdots \qquad (6.4)$$


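where

$$K_0(b,a) = \int_a^b \exp\left[\frac{i}{\hbar}\int_{t_a}^{t_b}\frac{m\dot{x}^2}{2}\,dt\right]\mathcal{D}x(t) \qquad (6.5)$$

$$K^{(1)}(b,a) = -\frac{i}{\hbar}\int_a^b \exp\left[\frac{i}{\hbar}\int_{t_a}^{t_b}\frac{m\dot{x}^2}{2}\,dt\right]\int_{t_a}^{t_b}V(x(s),s)\,ds\;\mathcal{D}x(t) \qquad (6.6)$$

$$K^{(2)}(b,a) = -\frac{1}{2\hbar^2}\int_a^b \exp\left[\frac{i}{\hbar}\int_{t_a}^{t_b}\frac{m\dot{x}^2}{2}\,dt\right]\int_{t_a}^{t_b}V(x(s),s)\,ds\int_{t_a}^{t_b}V(x(s'),s')\,ds'\;\mathcal{D}x(t) \qquad (6.7)$$

and so forth. To avoid confusion in the integrals over V, we call the
time variables s, s’, etc.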
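To get a feeling for how quickly the expansion of Eq. (6.3) converges along a single path, the sketch below compares partial sums of the exponential series with the exact exponential. The variable `theta` stands for the time integral of V along one path divided by ħ; its value 0.1 is a hypothetical "small" choice.

```python
import cmath


def series(theta, order):
    """Partial sum of exp(-i*theta) = sum_n (-i*theta)^n / n! up to `order`."""
    term = 1 + 0j
    total = 0j
    for n in range(order + 1):
        total += term
        term *= (-1j * theta) / (n + 1)
    return total


theta = 0.1   # (1/hbar) * integral of V dt along one path, assumed small
exact = cmath.exp(-1j * theta)

# Error after keeping terms through order 0, 1, and 2 in the potential.
errors = [abs(series(theta, k) - exact) for k in range(3)]
print(errors)
```

Each additional order reduces the error by roughly a factor of `theta`, which is why the expansion is useful only when the potential is weak in this sense.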


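Evaluation of the Terms. First consider the kernel $K^{(1)}$. We wish
to interchange the order of integration over the variable s and the path
x(t). We write

$$K^{(1)}(b,a) = -\frac{i}{\hbar}\int_{t_a}^{t_b}F(s)\,ds \qquad (6.8)$$

where

$$F(s) = \int_a^b \exp\left[\frac{i}{\hbar}\int_{t_a}^{t_b}\frac{m\dot{x}^2}{2}\,dt\right]V(x(s),s)\;\mathcal{D}x(t) \qquad (6.9)$$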
Fig. 6-1 A particle starts from a and moves
as a free particle to c. Here it is acted upon,
or scattered, by the potential $V(x(s), s) = V_c$.
Thereafter, it moves as a free particle to b.
The amplitude for such a motion is given in
Eq. (6.10). If this amplitude is integrated
over all possible positions of the point c, the
result is the first-order term in the
perturbation expansion.





The path integral F(s) can be described as follows. It is the sum over 
all paths of the free-particle amplitude. However, each path is weighted
by the potential V(x(s),s), evaluated at time s. The only characteristic 
of the path x(t) which is involved in this particular V is the position of 
the path at the particular time t = s. This means that before and after 
the time s the paths involved in F(s) are the paths of an ordinary free 
particle. The situation is sketched in Fig. 6-1. 

Using the same arguments which led to Eq. (2.31), we divide each
path into two parts, one before the time t = s and one after this time.
To be specific, we shall assume that each path goes through the point
$x_c$ at this division time. Later on we shall integrate over all values of
$x_c$. If we denote the point $x_c(s)$ by c (that is, $s = t_c$), then the sum
over all such paths can be written as $K_0(b,c)K_0(c,a)$. This means that
$F(s) = F(t_c)$ can be written as

$$F(t_c) = \int_{-\infty}^{\infty} K_0(b,c)\,V(x_c,t_c)\,K_0(c,a)\,dx_c \qquad (6.10)$$

Substituting this into Eq. (6.8) gives [with $V(c) = V(x_c,t_c)$]

$$K^{(1)}(b,a) = -\frac{i}{\hbar}\int_{t_a}^{t_b}\int_{-\infty}^{\infty} K_0(b,c)\,V(c)\,K_0(c,a)\,dx_c\,dt_c \qquad (6.11)$$


The path integral (6.6) has been evaluated as an ordinary integral (6.11). 

Here the limits on the integral over x have been written as $\pm\infty$.
In a practical problem the limits will be established by the potential 
(which in most cases drops to 0 when x becomes very large) or by the 
equipment, which restricts the range of x. 


Interpretation of the Terms. Equation (6.11) is very important 
and very useful, so we shall develop a special interpretation to help think 
about it physically. We call the interaction between the potential and 
the particle a scattering; thus we say that the potential scatters the
particle and that the amplitude to be scattered by a potential is $-(i/\hbar)V$
per unit volume and per unit time.

With this interpretation we can describe $K_V$ in the following way.
$K_V$ is, of course, a sum over alternative ways in which the particle may
move from point a to point b. The alternatives are:


1. The particle may not be scattered at all [$K_0(b,a)$].

2. The particle may be scattered once [$K^{(1)}(b,a)$].

3. The particle may be scattered twice [$K^{(2)}(b,a)$].
Etc.


In accordance with this interpretation, the various paths of the particle 
are diagramed in Fig. 6-2. 

Each one of these alternatives is itself a sum over alternatives.
Consider, for example, the kernel for a single scattering, $K^{(1)}(b,a)$.




Fig. 6-2 In (1) a particle moves from a to b through the potential V without
being scattered. The amplitude for this is $K_0(b,a)$. In (2) the particle is
scattered once at c as it moves through the potential V. The amplitude
for this is $K^{(1)}(b,a)$. In (3) the particle is scattered twice with the amplitude
$K^{(2)}(b,a)$. And in (4) it is scattered n times, the last scattering taking place at
c. The total amplitude for motion from a to b with any number of scatterings
is $K_0 + K^{(1)} + K^{(2)} + \cdots + K^{(n)} + \cdots$.
One of the alternatives which comprise this kernel consists of the
following motion. The particle starts at point a, moves as a free particle
to the point $x_c, t_c = c$, is there scattered by the potential V(c), after
which it moves as a free particle from point c to the final point b. The
amplitude for this path is

$$K_0(b,c)\left[-\frac{i}{\hbar}V(c)\,dx_c\,dt_c\right]K_0(c,a) \qquad (6.12)$$


(Remember that in the convention we are using, the motion of the
particle is traced by reading the formulas from right to left.)

The construction of this amplitude follows the rule stated in Sec. 2-5, 
namely, that the amplitudes for events occurring in succession in time 
multiply. The completed form for the kernel $K^{(1)}$ is obtained by adding
up all such alternatives by integrating over $x_c$ and $t_c$, as indicated in
Eq. (6.11).

Using this reasoning, we can write down the kernel $K^{(2)}$ for double
scattering immediately as

$$K^{(2)}(b,a) = \left(-\frac{i}{\hbar}\right)^2\int\!\!\int K_0(b,c)\,V(c)\,K_0(c,d)\,V(d)\,K_0(d,a)\,d\tau_c\,d\tau_d \qquad (6.13)$$


where $d\tau = dx\,dt$. Reading from right to left, this formula means: The
particle moves as a free particle from a to d. At d the particle gets 
scattered by the potential V(d) at that point. It then moves as a free 
particle from d to c, where it is scattered by the potential V(c). After 
that it moves from c to b, again as a free particle. We sum over all the 
alternatives, namely, all places and times that the scattering may take 
place. 

Here we have tacitly assumed that $t_c > t_d$. In order to avoid the
complication of having to introduce this assumption explicitly in each
such example, we shall make use of the convention adopted in Chap. 4
(Eq. 4.28) and assume

$$K(b,a) = 0 \quad \text{for } t_b < t_a \qquad (6.14)$$

Then Eq. (6.13) is correct without restrictions on the range of integration
of $t_d$ and $t_c$.

The reader may wonder what happened to the factor 1/2 which appears
in Eq. (6.7) but is omitted in Eq. (6.13). Note that in Eq. (6.13) the
range of integration for $t_d$ is still from $t_a$ to $t_b$; however, the range of $t_c$
has been restricted (by the definition of Eq. (6.14)) to lie between $t_d$ and
$t_b$. This restriction cuts the value of the double integral in half. To see
this more clearly, suppose the double integral of Eq. (6.7) is rewritten
as

$$\int_{t_a}^{t_b}\!\int_{t_a}^{t_b} V(x(s),s)\,V(x(s'),s')\,ds'\,ds = \qquad (6.15)$$

$$\int_{t_a}^{t_b}\!\int_{t_a}^{s} V(x(s),s)\,V(x(s'),s')\,ds'\,ds + \int_{t_a}^{t_b}\!\int_{s}^{t_b} V(x(s),s)\,V(x(s'),s')\,ds'\,ds$$

The first term on the right-hand side of this equation satisfies the
restrictions implied by Eq. (6.14). By interchanging the order of
integration, the second term on the right-hand side can be rewritten as

$$\int_{t_a}^{t_b}\!\int_{t_a}^{s'} V(x(s),s)\,V(x(s'),s')\,ds\,ds' \qquad (6.16)$$

If the variable names s and s’ are interchanged in the last expression,
the value of the double integral remains the same. (This useful result
is recorded in the Appendix as Eq. A.12.) This means that the first
and second terms on the right-hand side of Eq. (6.15) are equal, so each
one is half the value of the original double integral. This same sort of
argument accounts for a factor 1/n! in the expression for $K^{(n)}$.
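The factor of one-half can also be seen numerically. The sketch below evaluates the double integral of Eq. (6.15) on a grid for one fixed path, once over the full square and once over the time-ordered region s' < s only. The function `V_on_path` is a hypothetical stand-in for V(x(s), s) along that path, and the grid size is a numerical choice.

```python
def V_on_path(s):
    """Hypothetical values of V(x(s), s) along one fixed path."""
    return 1.0 + 0.5 * s * s


ta, tb, N = 0.0, 1.0, 400
h = (tb - ta) / N
vals = [V_on_path(ta + (k + 0.5) * h) for k in range(N)]

# Full double integral over the square ta < s, s' < tb.
full = sum(vals) ** 2 * h * h

# Time-ordered part only: s' earlier than s (first term on the right of Eq. 6.15).
ordered = sum(vals[i] * vals[j] for i in range(N) for j in range(i)) * h * h

ratio = ordered / full   # approaches 1/2 as the grid is refined
print(ratio)
```

The small departure from 1/2 comes from the diagonal cells s = s', whose weight vanishes as the grid spacing goes to zero.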


Problem 6-1 Suppose the potential can be written as U + V, where
V is small but U is large. Suppose further that the kernel for motion
in the potential of U alone can be worked out (for example, U might
be quadratic in x and independent of time). Show that the motion in
the complete potential U + V is described by Eqs. (6.4), (6.11), (6.13),
and (6.14) with $K_0$ replaced by $K_U$, where $K_U$ is the kernel for motion
in the potential U alone. Thus we can consider V as a perturbation
on the potential U. We can say that $-(i/\hbar)V$ is the amplitude to be
scattered by the perturbing part of the potential (per unit volume and
per unit time). $K_U$ is the amplitude for the motion in the system in the
unperturbed potential U.


Problem 6-2 Suppose a system consists of two particles which 
interact only through a potential V (x,y), where x represents the coor- 
dinates of the first particle and y represents the coordinates of the second 
(see Sec. 3-8 and Eq. 3.75). Apart from this interaction, the particles 
are free. If V were 0, then K would be simply a product of the two 
free-particle kernels. Using this fact, develop a perturbation expansion 
for $K_V(x_b, y_b, t_b; x_a, y_a, t_a)$. By what rules of physical reasoning can the
various terms in this expression be described? 
AN INTEGRAL EQUATION FOR $K_V$


Before applying the results of the preceding paragraphs to a special 
example, we shall develop some mathematical relations involving the 
kernels and wave functions of systems moving in a potential field. Using 
the results so far obtained, we can write Eq. (6.4) as follows:

$$K_V(b,a) = K_0(b,a) - \frac{i}{\hbar}\int K_0(b,c)\,V(c)\,K_0(c,a)\,d\tau_c \qquad (6.17)$$

$$+ \left(-\frac{i}{\hbar}\right)^2\int\!\!\int K_0(b,c)\,V(c)\,K_0(c,d)\,V(d)\,K_0(d,a)\,d\tau_c\,d\tau_d + \cdots$$

Alternatively, this expression could be written as

$$K_V(b,a) = K_0(b,a) \qquad (6.18)$$

$$- \frac{i}{\hbar}\int K_0(b,c)\,V(c)\left[K_0(c,a) - \frac{i}{\hbar}\int K_0(c,d)\,V(d)\,K_0(d,a)\,d\tau_d + \cdots\right]d\tau_c$$

The expression in square brackets has the same form as Eq. (6.17). In
both cases the sums extend over an infinite number of terms. This means
that $K_V$ can be written as

$$K_V(b,a) = K_0(b,a) - \frac{i}{\hbar}\int K_0(b,c)\,V(c)\,K_V(c,a)\,d\tau_c \qquad (6.19)$$

which is an exact expression. This is an integral equation determining
$K_V$ if $K_0$ is known. (Note that for the situation described in Prob. 6-1,
$K_0$ would be replaced by $K_U$.) Thus the path integral problem has been
transformed into an integral equation.
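The step from the infinite series of Eq. (6.17) to the closed form of Eq. (6.19) can be illustrated with a finite matrix model, in which the integrals over the intermediate points become matrix products. In this sketch the 2×2 matrices `K0` and `W` are hypothetical stand-ins for the free kernel and for $-(i/\hbar)V\,d\tau$; they are chosen small enough for the series to converge and are not derived from any physical potential.

```python
def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def matadd(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]


def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))


# Hypothetical stand-ins: K0 for the free kernel, W for -(i/hbar) V dtau.
K0 = [[1.0, 0.2], [0.2, 1.0]]
W = [[-0.3j, 0.0], [0.0, -0.15j]]
K0W = matmul(K0, W)

# Series of Eq. (6.17): K0 + K0 W K0 + K0 W K0 W K0 + ...
term = K0
partial_sums = [K0]
for _ in range(60):
    term = matmul(K0W, term)                 # one more scattering
    partial_sums.append(matadd(partial_sums[-1], term))
KV = partial_sums[-1]

# Integral-equation form, Eq. (6.19): KV = K0 + K0 W KV.
residual = max_abs_diff(KV, matadd(K0, matmul(K0W, KV)))

# Truncation errors after one and after two scatterings.
born1_error = max_abs_diff(partial_sums[1], KV)
born2_error = max_abs_diff(partial_sums[2], KV)
print(residual, born1_error, born2_error)
```

The summed series satisfies the integral-equation form to rounding error, and each additional scattering term reduces the truncation error.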

This last result can be understood physically in the following way. 
The total amplitude for the transition of the system from a to b, with any 
number of scatterings, can be expressed as the sum of two alternatives. 
The first alternative is the amplitude that the transition takes place with 
no scatterings, which is expressed by Ko. The second alternative is the 
amplitude that the transition takes place with one or more scatterings, 
which is given by the last term of Eq. (6.19). In this last term the point
c can be thought of as the point at which the last scattering takes place.
Thus the system moves from a to c in the potential field with its motion
exactly described by $K_V(c,a)$. Then at point c the final scattering takes
place, after which the system moves as a free system (without scattering) 
to the point b, as represented by the kernel Ko. This interpretation is 
diagramed in Fig. 6-3. 
Fig. 6-3 In (1) the particle moves from a to b through the potential V as
a free particle, described by the amplitude $K_0(b,a)$. In (2) the particle is
scattered one or more times by V, with the last scattering taking place at
c. The motion from a to c is described by $K_V(c,a)$, and that from c to b
by $K_0(b,c)$. A combination of the two situations, when all positions of c
are accounted for, covers all possible cases and gives $K_V(b,a)$ in the form of
Eq. (6.19).


Since the last scattering could take place at any point in space and 
time between a and b, the amplitude for this composite motion, repre- 
sented by the integrand of the last term of Eq. (6.19), must be integrated 
over all possible positions of the point c. 


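Problem 6-3 For a free particle, Eq. (4.29) reduces to

$$\frac{\partial}{\partial t_b}K_0(b,a) + \frac{i}{\hbar}\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_b^2}K_0(b,a)\right] = \delta(t_b - t_a)\,\delta(x_b - x_a) \qquad (6.20)$$

Show, from this result and Eq. (6.19), that the kernel $K_V$ satisfies the
differential equation

$$\frac{\partial}{\partial t_b}K_V(b,a) + \frac{i}{\hbar}\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_b^2}K_V(b,a) + V(b)\,K_V(b,a)\right] = \delta(t_b - t_a)\,\delta(x_b - x_a) \qquad (6.21)$$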


AN EXPANSION FOR THE WAVE FUNCTION 


In Sec. 3-4 we introduced the idea of a wave function and discussed 
some relations between wave functions and kernels. Equation (3.42) of 
that section shows how the wave function at time $t_b$ can be obtained
from the wave function at an earlier time $t_a$ with the help of the kernel
describing the motion of the system between the two times. For our
present purposes this equation can be written as

$$\psi(b) = \int K_V(b,a)\,f(a)\,dx_a \qquad (6.22)$$

where f(a) is the value of the wave function at the time $t = t_a$ (that
is, f(a) is a function of $x_a$), $\psi(b)$ is the wave function at the later time
$t = t_b$,¹ and we suppose that between the two times the system is moving
in the potential V, with the motion described by the kernel $K_V(b,a)$.

If the series expansion (Eq. 6.17) for $K_V$ is substituted into this
equation, the result will be a series expansion for $\psi(b)$. Thus

$$\psi(b) = \int K_0(b,a)\,f(a)\,dx_a \qquad (6.23)$$

$$- \frac{i}{\hbar}\int\!\!\int K_0(b,c)\,V(c)\,K_0(c,a)\,d\tau_c\,f(a)\,dx_a + \cdots$$

The first term of the series gives the wave function at the time $t_b$
assuming the system to be free (or unperturbed, in case $K_U$ is to be
substituted for $K_0$) between the time $t_a$ and $t_b$. Call this term $\phi$. Thus

$$\phi(b) = \int K_0(b,a)\,f(a)\,dx_a \qquad (6.24)$$

Using this definition, the series of Eq. (6.23) can be rewritten as

$$\psi(b) = \phi(b) - \frac{i}{\hbar}\int K_0(b,c)\,V(c)\,\phi(c)\,d\tau_c \qquad (6.25)$$

$$+ \left(-\frac{i}{\hbar}\right)^2\int\!\!\int K_0(b,c)\,V(c)\,K_0(c,d)\,V(d)\,\phi(d)\,d\tau_c\,d\tau_d + \cdots$$


In this form the series is called the Born expansion for $\psi$. If only the
first two terms are included (thus only through first order in V), the
result is the first Born approximation. It involves a single scattering by
the potential V. This scattering occurs at the point c. Up to this point
the system described by $\phi(c)$ is free; and after the scattering, the system
moves from c to b, again free, and described by $K_0(b,c)$. An integral
must be taken over all the possible points at which the scattering occurs.
If three terms of the series are used (thus through terms of second order
in V), the result is called the second Born approximation, etc.


¹Note that our convention that K(b,a) is zero for $t_b < t_a$ makes Eq. (6.22) invalid
if $t_b < t_a$, but we shall not use it in this range of t values.
Problem 6-4 Using arguments similar to those leading to Eq. (6.19),
show that the wave function $\psi(b)$ satisfies the integral equation

$$\psi(b) = \phi(b) - \frac{i}{\hbar}\int K_0(b,c)\,V(c)\,\psi(c)\,d\tau_c \qquad (6.26)$$

This integral equation is equivalent to the Schrödinger equation

$$-\frac{\hbar}{i}\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi \qquad (6.27)$$

Working in one dimension only, show how the Schrödinger equation may
be deduced from the integral equation.


THE SCATTERING OF AN ELECTRON BY AN ATOM 


Mathematical Treatment. We have developed the concepts and 
formulas of the perturbation treatment in a somewhat abstract frame- 
work. Now, to develop a physical understanding of the perturbation 
method, we shall discuss the specific problem of the scattering of a fast 
electron by an atom. We envision an experiment in which a beam of 
electrons bombards a target, such as a thin foil of metal, and then is 
collected by some suitable counter, as shown in Fig. 6-4. 





Fig. 6-4 Electrons boil off a hot filament at a, are screened into a beam by 
collimating holes in s and s’, and then strike a thin-foil target at O. Most of 
the electrons pass straight on without being scattered (if their energy is great 
enough and the target is thin enough), but some are deflected by interactions 
with the atoms in the target and scattered, for example, through an angle θ
to b. As the counter at b is moved up and down, the relation between the
relative number of scatterings and the scattering angle θ can be measured.
Suppose the energy of the scattering particles is determined by a
time-of-flight method. That is, we release an electron from the source at
one time, say t = 0, and ask for the chance that it arrives at the counter
after some delay T. We can then make direct use of our result for the
amplitude K(b,a) to go from one place to another in a definite time.

We shall simplify the problem by assuming that either the foil is so 
thin or the interaction is so weak that each electron can interact with, 
at most, one atom. Actually, this assumption is quite realistic for many 
scattering experiments. Furthermore, most multiple scatterings can be 
analyzed in terms of the simple scattering from one atom. Thus we shall
discuss the interaction between a single electron and a single atom. 

The center of the atom will be taken as the center of a coordinate 
system in which the electrons are released at the point a, as in Fig. 6-5, 
at the time t = 0. A counter placed at the point b tells us whether or 
not, at the time t = T, the electron arrives at the point b. We shall 
make the following approximations: 


1. The interaction can be represented by a first-order Born 
approximation. That is, the electron is scattered only 
once by the atom. 

2. The atom can be represented by a potential V(r) fixed
in space and constant in time. 


Actually, the atom presents a very complicated system interacting 
with the electron, and the interaction between the electron and the atom 
is really more complicated than can be represented by a simple potential 
V(r). The electron could excite or ionize the atom and lose energy in 





Fig. 6-5 The geometry of the scattering problem. The electron starts at a and moves
as a free particle to c, where it is scattered by the atomic potential $V(\mathbf{x}_c)$. After
the scattering, it moves as a free particle to the counter at b, which is located at
the end of vector $\mathbf{x}_b$ from the scattering center O. In this process, the electron has
been scattered through the angle θ, measured from the direction of the nonscattered
beam. This process corresponds to the first-order Born approximation. If the amplitude
for two scatterings, say, at d and c, is included, then the result is the second-order
approximation, etc.
the process. It can be shown, however, that if we consider only elastic
collisions between the electron and the atom, so that the atom is in 
the same energy state after the collision as it is before, then when the 
approximation (1) is valid, approximation (2) is valid too. 

Let $\mathbf{x}_a$ and $\mathbf{x}_b$ be the vectors from the center of the atom to the
points at which the electron is released and detected, respectively. In
the calculations we shall take $\mathbf{x}_a$ and $\mathbf{x}_b$ to have lengths much larger
than the radius of the atom. That is, we shall assume that the atomic
potential V(r) becomes negligibly small at distances much smaller than
$|\mathbf{x}_a|$ and $|\mathbf{x}_b|$. Thus during most of its flight, the electron will be moving
as a free particle, and only in the vicinity of the origin will it be exposed 
to the potential. 

The first-order Born approximation contains two terms, only the 
second of which is of interest to us here. The first term is the kernel 
Ko(b, a) for the motion of the electron from a to b as a free particle, and 
it has already been studied sufficiently. The term of interest is then 


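$$K^{(1)}(b,a) = -\frac{i}{\hbar}\int K_0(b,c)\,V(c)\,K_0(c,a)\,d\tau_c \qquad (6.28)$$

$$= -\frac{i}{\hbar}\int_0^T\!\!\int \left[\frac{m}{2\pi i\hbar(T - t_c)}\right]^{3/2} \exp\left[\frac{im|\mathbf{x}_b - \mathbf{x}_c|^2}{2\hbar(T - t_c)}\right]$$

$$\times\, V(\mathbf{x}_c)\left[\frac{m}{2\pi i\hbar t_c}\right]^{3/2} \exp\left[\frac{im|\mathbf{x}_c - \mathbf{x}_a|^2}{2\hbar t_c}\right] d^3\mathbf{x}_c\,dt_c$$

Here we have used $\mathbf{x}_c$ as the vector from the origin to the point c, and
$d^3\mathbf{x}_c$ represents the product of the differentials of all the components of
the vector $\mathbf{x}_c$. The integral over $t_c$ gives (see Appendix, Eq. A.5)

$$K^{(1)}(b,a) = -\frac{i}{\hbar}\left(\frac{m}{2\pi i\hbar}\right)^{5/2}\frac{1}{T^{3/2}} \int \left(\frac{1}{R_{ca}} + \frac{1}{R_{bc}}\right) \exp\left[\frac{im}{2\hbar T}(R_{ca} + R_{bc})^2\right] V(\mathbf{x}_c)\,d^3\mathbf{x}_c \qquad (6.29)$$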

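where $R_{ca} = |\mathbf{x}_c - \mathbf{x}_a|$ and $R_{bc} = |\mathbf{x}_b - \mathbf{x}_c|$. Using these definitions, as
well as $r_a = |\mathbf{x}_a|$ and $r_b = |\mathbf{x}_b|$, we write

$$R_{ca} = r_a\left(1 - \frac{2\,\mathbf{x}_a\cdot\mathbf{x}_c}{r_a^2} + \frac{|\mathbf{x}_c|^2}{r_a^2}\right)^{1/2} \approx r_a + \mathbf{i}_a\cdot\mathbf{x}_c \qquad (6.30)$$

$$R_{bc} = r_b\left(1 - \frac{2\,\mathbf{x}_b\cdot\mathbf{x}_c}{r_b^2} + \frac{|\mathbf{x}_c|^2}{r_b^2}\right)^{1/2} \approx r_b - \mathbf{i}_b\cdot\mathbf{x}_c \qquad (6.31)$$

where $\mathbf{i}_a$ and $\mathbf{i}_b$ are unit vectors in the direction of $-\mathbf{x}_a$ and $\mathbf{x}_b$,
respectively (that is, $\mathbf{i}_a = -\mathbf{x}_a/r_a$), and we have made use in the
approximation of the fact that $r_a$ is much larger than any value of $|\mathbf{x}_c|$ for which the
potential is not negligible. It is necessary to keep the first-order terms
in $|\mathbf{x}_c|$ only in the argument of the exponential, since this factor is quite
sensitive to small relative changes in phase. Here we need

$$(R_{ca} + R_{bc})^2 \approx (r_a + r_b)^2 + 2(r_a + r_b)(\mathbf{i}_a\cdot\mathbf{x}_c - \mathbf{i}_b\cdot\mathbf{x}_c) \qquad (6.32)$$

Using these approximations, the kernel can be written as

$$K^{(1)}(b,a) \approx -\frac{i}{\hbar}\left(\frac{m}{2\pi i\hbar}\right)^{5/2}\frac{1}{T^{3/2}}\left(\frac{1}{r_a} + \frac{1}{r_b}\right)\exp\left[\frac{im}{2\hbar T}(r_a + r_b)^2\right] \qquad (6.33)$$

$$\times \int \exp\left[\frac{im}{\hbar T}(r_a + r_b)(\mathbf{i}_a\cdot\mathbf{x}_c - \mathbf{i}_b\cdot\mathbf{x}_c)\right] V(\mathbf{x}_c)\,d^3\mathbf{x}_c$$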
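The far-field approximations of Eqs. (6.30) to (6.32) are easy to test with numbers. In the sketch below the source and counter are placed thousands of length units from the origin while the scattering point lies about one unit away; all coordinates are hypothetical and serve only to show the size of the neglected terms.

```python
import math


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def sub(u, v):
    return tuple(p - q for p, q in zip(u, v))


# Hypothetical geometry: source and counter far away, scattering point near the origin.
x_a = (-8000.0, 0.0, 0.0)
x_b = (3000.0, 4000.0, 0.0)
x_c = (0.3, -0.4, 1.2)

r_a, r_b = norm(x_a), norm(x_b)
i_a = tuple(-c / r_a for c in x_a)   # unit vector along -x_a
i_b = tuple(c / r_b for c in x_b)    # unit vector along  x_b

# Exact distances and their first-order approximations, Eqs. (6.30) and (6.31).
R_ca = norm(sub(x_c, x_a))
R_bc = norm(sub(x_b, x_c))
approx_ca = r_a + dot(i_a, x_c)
approx_bc = r_b - dot(i_b, x_c)

# Eq. (6.32): the combination that enters the phase.
lhs = (R_ca + R_bc) ** 2
rhs = (r_a + r_b) ** 2 + 2 * (r_a + r_b) * (dot(i_a, x_c) - dot(i_b, x_c))

print(R_ca - approx_ca, R_bc - approx_bc, (lhs - rhs) / lhs)
```

The neglected terms are of order $|\mathbf{x}_c|^2/r_a$ and $|\mathbf{x}_c|^2/r_b$, so they shrink as the source and counter are moved farther from the atom.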





Physical Interpretation. We can deduce some of the physical
characteristics of the motion from a study of Eq. (6.33). In the time T
the electron has traveled the total distance of $r_a + r_b$. Thus its velocity
during this time is $u = (r_a + r_b)/T$ and its energy is $mu^2/2$, while its
momentum has the magnitude $mu$. In writing these expressions we are
making the assumption that the energy of the electron is not changed
by the scattering process.

That these values for the velocities, energy, and momentum are
consistent can be verified from an inspection of the exponential factor
appearing in front of the integral of Eq. (6.33). The phase of this
exponential term is $im(r_a + r_b)^2/2\hbar T$, and the derivative of this phase with
respect to T gives the frequency as

$$\omega = \frac{m}{2\hbar}\frac{(r_a + r_b)^2}{T^2} \qquad (6.34)$$

With u defined as above this means that the energy is $mu^2/2$ (see
Eq. 3.15).

Differentiating the phase with respect to $r_b$ yields the wave number
at the point b as

$$k = \frac{m}{\hbar}\frac{r_a + r_b}{T} \qquad (6.35)$$

which means that the magnitude of the momentum is $mu$ (see Eq. 3.12).


Problem 6-5 The integral over $t_c$ in Eq. (6.28) can be performed
approximately using the method of stationary phase. By studying the
application of such a method to this integral, show that most of the
contribution to the integral comes from values of $t_c$ near the region
$t_c = r_a/u$, the time at which the electron would arrive at the center of
the atom if it moved in a classical manner.
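As a numerical hint for the stationary-phase argument, the sketch below locates the time at which the phase of the integrand in Eq. (6.28) is stationary, keeping only the leading terms with $R_{ca} \approx r_a$ and $R_{bc} \approx r_b$ and dropping the common factor $m/2\hbar$. The values of `r_a`, `r_b`, and `T` are hypothetical, and bisection is just one convenient way to find the root.

```python
r_a, r_b, T = 3.0, 5.0, 2.0      # hypothetical distances and flight time
u = (r_a + r_b) / T              # velocity as defined in the text


def phase(t):
    """Leading phase of the integrand, up to the factor m / (2 hbar)."""
    return r_b ** 2 / (T - t) + r_a ** 2 / t


def dphase(t):
    """Derivative of phase(t) with respect to t."""
    return r_b ** 2 / (T - t) ** 2 - r_a ** 2 / t ** 2


# dphase increases monotonically from large negative to large positive
# values on (0, T), so bisection finds its single zero.
lo, hi = 1e-9 * T, T * (1 - 1e-9)
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if dphase(mid) < 0:
        lo = mid
    else:
        hi = mid
t_star = 0.5 * (lo + hi)

print(t_star, r_a / u, phase(t_star), (r_a + r_b) ** 2 / T)
```

At the stationary point the phase takes the value $(r_a + r_b)^2/T$, which is the combination appearing in the exponential in front of the integral in Eq. (6.33).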


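With the velocity of the electron defined as $u = (r_a + r_b)/T$, define
the incoming vector momentum $\mathbf{p}_a$ as

$$\mathbf{p}_a = mu\,\mathbf{i}_a \qquad (6.36)$$

and the outgoing vector momentum $\mathbf{p}_b$ as

$$\mathbf{p}_b = mu\,\mathbf{i}_b \qquad (6.37)$$

Then Eq. (6.33) can be written as

$$K^{(1)}(b,a) \approx -\frac{i}{\hbar}\left(\frac{m}{2\pi i\hbar}\right)^{5/2}\frac{u}{T^{1/2}\,r_a r_b}\,e^{(i/\hbar)(mu^2/2)T} \int e^{(i/\hbar)(\mathbf{p}_a - \mathbf{p}_b)\cdot\mathbf{x}_c}\,V(\mathbf{x}_c)\,d^3\mathbf{x}_c \qquad (6.38)$$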
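Call the negative change in momentum, or the momentum transfer,

$$\breve{\mathbf{p}} = \mathbf{p}_a - \mathbf{p}_b$$

and define the quantity $v(\breve{\mathbf{p}})$ as

$$v(\breve{\mathbf{p}}) = \int e^{(i/\hbar)\,\breve{\mathbf{p}}\cdot\mathbf{r}}\,V(\mathbf{r})\,d^3\mathbf{r} \qquad (6.39)$$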
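To see what Eq. (6.39) produces for a concrete case, the sketch below evaluates it for a hypothetical Gaussian potential, which is not taken from the text and is chosen only because its transform is known in closed form. For a spherically symmetric V(r) the angular integrals can be done first, leaving a single radial integral; the cutoff `r_max`, the number of steps `n`, and the units with ħ = 1 are numerical assumptions.

```python
import math

hbar = 1.0             # units with hbar = 1 (assumption)
V0, sigma = 1.0, 1.0   # hypothetical potential V(r) = V0 exp(-r^2 / (2 sigma^2))


def V(r):
    return V0 * math.exp(-r * r / (2.0 * sigma * sigma))


def v_of_p(p, r_max=12.0, n=20000):
    """Eq. (6.39) for spherically symmetric V, after the angular integrals:

    v(p) = (4 pi hbar / p) * integral_0^inf r sin(p r / hbar) V(r) dr,
    evaluated with the midpoint rule on [0, r_max].
    """
    h = r_max / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        total += r * math.sin(p * r / hbar) * V(r)
    return 4.0 * math.pi * hbar / p * total * h


p = 1.5   # magnitude of the momentum transfer (hypothetical)
numeric = v_of_p(p)

# Closed-form Fourier transform of a three-dimensional Gaussian.
analytic = (V0 * (2.0 * math.pi * sigma ** 2) ** 1.5
            * math.exp(-(p * sigma) ** 2 / (2.0 * hbar ** 2)))

print(numeric, analytic)
```

In this example v depends only on the magnitude of the momentum transfer and falls off once $p\sigma/\hbar$ exceeds about one, so a broader potential scatters mainly through small momentum transfers.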


The probability that an electron arrives at the point b is given by 
the square of the absolute value of the kernel $K_V(b,a)$. Thus the
probability will depend upon the first term in the series expansion of this
kernel, namely, $K_0(b,a)$, which is likely to be so large as to completely
overshadow the small perturbation term $K^{(1)}(b,a)$.

For this reason it is customary in most scattering experiments to 
collimate the incoming beam with suitable shields so that those elec- 
trons which are not scattered by the atoms in the target are confined to 
the region of a particular line (or direction), as shown in Fig. 6-6. Of 
course, there will be some diffraction by the collimating shields, such as 
that studied in Secs. 3-2 and 3-3, which means that some nonscattered 
electrons will appear outside this central beam. However, with suitable
collimation, and for positions suitably far away from the collimated
beam, the number of electrons diffracted by the collimator will be very 
small compared to the number scattered by the atoms in the target. 

In such a region the probability of arrival for an electron is given,
at least to first order, by the square of the absolute value of
$K^{(1)}(b,a)$ alone. Using Eqs. (6.38) and (6.39), this probability is

$$\frac{P(b)}{\text{unit volume}} = \frac{1}{\hbar^2}\left(\frac{m}{2\pi\hbar}\right)^5\frac{u^2}{T\,r_a^2 r_b^2}\,|v(\breve{\mathbf{p}})|^2 \tag{6.40}$$

In this last expression the factor $v(\breve{\mathbf{p}})$ contains the
characteristics of the atomic potential and the dependence of the kernel
upon the relative directions of $\mathbf{x}_a$ and $\mathbf{x}_b$. It is
completely independent of the dimensions of the experimental equipment.
The effects of such dimensions are represented by the remaining factors
of Eq. (6.40). For example, the term $1/r_a^2$ can be easily seen to
result from the idea that the chance for an electron to actually hit the
atom varies inversely as $r_a^2$. The application of such an idea might
be questioned in this experiment in view of the fact that we have
supposed some collimating shields are present. However, this collimation
has a negligible effect over atomic dimensions. From the point of view of
a target atom the beam of oncoming electrons appears to consist of
electrons spreading in all directions from a single source.

In a similar manner, after the scattering, the electrons spread out
again in all directions from the scattering atom. Thus the chance per
unit volume to find an electron in the counter varies inversely as
$r_b^2$. Since the more interesting features of the experiment are
contained in the function $v(\breve{\mathbf{p}})$, we shall give special
attention to this function in the next section.

The additional factors depend on the particular normalization of 
our kernel. We can interpret the formula more easily if we give it as 
a ratio. We compare the probability of finding a scattered particle 
at b to the probability of finding one at a point d behind the atom 
at the same total distance $r_a + r_b$ (and at the same time $T$, to keep
the velocity the same) if no scattering occurred, as shown in Fig. 6-7. 
That is, we calculate P(d) per unit volume, as if no atom were present. 





Fig. 6-6 Principle of collimation to eliminate the zero-order term at b. Only
electrons which have been scattered at least once can get from a to b with
any reasonable probability. Thus the zero-order term in the perturbation
expansion of $K_V(b,a)$ will contribute a negligible amount and can be
neglected. The first term of importance is $K^{(1)}(b,a)$.

Fig. 6-7 If d and b are the same total distance from O, namely $r_b$, then
the difference (or ratio) of numbers of electrons arriving at the two points
can depend only on the scattering phenomenon. If d is in the direct line of
nonscattered electrons, the ratio of the number arriving at b to that arriving
at d if no scattering source were present is the probability of scattering to b.


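The result is $|K_0(d,a)|^2$ or

$$\frac{P(d)}{\text{unit volume}} = \left(\frac{m}{2\pi\hbar T}\right)^3 = \left(\frac{m}{2\pi\hbar}\right)^3\frac{u^3}{(r_a + r_b)^3} \tag{6.41}$$

so that

$$\frac{P(b)}{P(d)} = \left(\frac{m}{2\pi\hbar^2}\right)^2 |v(\breve{\mathbf{p}})|^2\,\frac{(r_a + r_b)^2}{r_a^2 r_b^2} \tag{6.42}$$

We shall interpret the last factor geometrically in the next section, where
we shall also give more detailed attention to the function
$v(\breve{\mathbf{p}})$.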


The Cross Section for Scattering. It is convenient to describe
the characteristics of an atom in a scattering experiment by means of the
concept of a cross section. The utility of such a concept stems from the
convenience of thinking along the lines of classical physics. The cross
section $d\sigma/d\Omega$ is defined as the effective target area (from a
classical point of view) of the atom that must be hit by an electron in
order that the electron be scattered into a unit solid angle. This solid
angle is measured around a sphere whose center is at the atom. The cross
section is thus a function of the scattering angle, i.e., the angle between
$\mathbf{x}_a$ and $\mathbf{x}_b$. In terms of such a classical model we can
determine the probability that an electron arrives at the point b.

If particles starting from the origin were to hit a small target of area
$d\sigma$ at distance $r_a$, these particles would be removed from the region
d, where they would have spread out over an area
$[(r_a + r_b)/r_a]^2\,d\sigma$. Instead they are sent out in a solid angle
$d\Omega$ toward b and are therefore spread out over an area $r_b^2\,d\Omega$
there, as shown in Fig. 6-8. Hence the ratio of the probability of finding
them at b to that of finding them at d if there were no target is the inverse
ratio of these areas,

$$\frac{P(b)}{P(d)} = \frac{[(r_a + r_b)/r_a]^2\,d\sigma}{r_b^2\,d\Omega} \tag{6.43}$$

Fig. 6-8 Particles striking an area $d\sigma$ of the target are deflected
through an angle $\theta$ into an area measured by the solid angle $d\Omega$.
If no target had been present, those particles would have proceeded to point
d. Instead, they proceed to point b, spreading out into the area
$r_b^2\,d\Omega$. The probability of finding a particle at d is inversely
proportional to the area over which the beam would have spread in arriving at
d. Similarly, the probability of finding the particle at b is inversely
proportional to the area $r_b^2\,d\Omega$ over which the beam of scattered
particles spreads in traveling from the target to b. If we take the ratio of
these areas, we have the inverse ratio of the associated probabilities. From
this point of view we say that all of the particles which hit the target area
$d\sigma$ are scattered, and through a particular angle $\theta$. Of course,
actually only a few particles which hit the target are scattered at all and
only a fraction of these through the angle $\theta$. Thus, the area element
$d\sigma$ which we have used in this calculation is the effective
cross-sectional area for scattering through the angle $\theta$ measured in
terms of the element of solid angle $d\Omega$ into which the particles are
scattered.


On comparing Eqs. (6.42) and (6.43), we see that the cross section per
unit solid angle is

$$\frac{d\sigma}{d\Omega} = \left(\frac{m}{2\pi\hbar^2}\right)^2 |v(\breve{\mathbf{p}})|^2 \tag{6.44}$$

The main advantage of an expression in terms of cross section instead 
of using Eq. (6.40) directly is this: Equation (6.44) does not depend on 
particular experimental conditions, so the cross sections obtained in one 
or another experiment can be directly compared, whereas probabilities 
per unit volume cannot be.

It must be emphasized that this idea of an effective target is purely 
classical and is convenient in recording scattering probabilities. There is
no direct relation between it and the size of the atom, nor is the
scattering mechanism to be thought of as localized exactly over such an area.
For example, the shadow one would expect to find classically behind the 
target is not there in the classical sense (with sharp boundaries); for 
since we are dealing with a wave phenomenon, there is diffraction into 
the shadow. 


Special Forms of the Atomic Potential. The results obtained 
when the atomic potential V(r) is assumed to have various forms are 
shown in the following problems. 


Problem 6-6 Suppose the potential is that of a central force. Thus
$V(\mathbf{r}) = V(r)$. Show that $v(\breve{\mathbf{p}})$ can be written as

$$v(\breve{p}) = \frac{4\pi\hbar}{\breve{p}}\int_0^\infty \sin\!\left(\frac{\breve{p}\,r}{\hbar}\right)V(r)\,r\,dr \tag{6.45}$$

Suppose $V(r)$ is the Coulomb potential $-Ze^2/r$. In this case the integral
for $v(\breve{p})$ is oscillatory at the upper limit. But convergence of the
integral can be artificially forced by introducing the factor
$e^{-\epsilon r}$ and then taking the limit of the result as
$\epsilon \to 0$. Following through this calculation, show that the cross
section corresponds to the Rutherford cross section

$$\frac{d\sigma_{\text{Ruth}}}{d\Omega} = \frac{4m^2Z^2e^4}{\breve{p}^4} = \frac{Z^2e^4}{16(mu^2/2)^2\sin^4(\theta/2)} \tag{6.46}$$

where $e$ = charge on a proton,

$$\breve{p} = 2p\sin(\theta/2) = 2mu\sin(\theta/2) \tag{6.47}$$

and $\theta$ = angle between the vectors $\mathbf{i}_a$ and $\mathbf{i}_b$.


The result of Prob. 6-6 is, accidentally, exact. That is, the first- 
order Born approximation gives the exact value of the probability for 
scattering in a Coulomb potential. This does not mean the higher-order 
terms are zero; it means, rather, that they contribute only to the phase
of the scattering amplitude. Since the probability is the absolute square 
of the amplitude, it is independent of the phase. Thus a first-order Born 
approximation, which gives the correct value for the probability, is not 
exact for the amplitude. This case of a Coulomb scattering is amusing, 
for there is also another accident. A completely classical treatment of 
such a scattering problem, i.e., treating the electrons as charged point 
masses, gives the same result.


Problem 6-7 Suppose the potential energy $V(\mathbf{r}) = -e\phi(\mathbf{r})$
is the result of a charge distribution $\rho(\mathbf{r})$ so that

$$\nabla^2\phi(\mathbf{r}) = -4\pi\rho(\mathbf{r}) \tag{6.48}$$

By assuming that $\rho(\mathbf{r})$ goes to 0 as $|\mathbf{r}| \to \infty$,
multiplying Eq. (6.48) by $e^{i(\breve{\mathbf{p}}/\hbar)\cdot\mathbf{r}}$ and
integrating twice over $\mathbf{r}$, show that $v(\breve{\mathbf{p}})$ can be
expressed in terms of $\rho(\mathbf{r})$ as

$$v(\breve{\mathbf{p}}) = -\frac{4\pi\hbar^2 e}{\breve{p}^2}\int e^{i(\breve{\mathbf{p}}/\hbar)\cdot\mathbf{r}}\,\rho(\mathbf{r})\,d^3\mathbf{r} \tag{6.49}$$

An atom can be represented in terms of its charge density. At the
nucleus the charge density is singular, so that it can be represented as a
Dirac delta function of $\mathbf{r}$ of strength $Ze$, where $Z$ is the
atomic number of the nucleus. Then if $\rho_e(\mathbf{r})$ is the density of
atomic electrons, $v(\breve{\mathbf{p}})$ is

$$v(\breve{\mathbf{p}}) = -\frac{4\pi\hbar^2 e^2}{\breve{p}^2}\left[Z - \int e^{i(\breve{\mathbf{p}}/\hbar)\cdot\mathbf{r}}\,\rho_e(\mathbf{r})\,d^3\mathbf{r}\right] \tag{6.50}$$

The quantity in the brackets is called the form factor for electron
scattering. (Incidentally, a similar form factor appears in X-ray scattering.
The theory of X-ray scattering shows that only the atomic electrons,
and not the nucleus, contribute to the scattering. Thus the form factor
for X-ray scattering is the same but with the $Z$ omitted.)


Problem 6-8 In an atom the potential follows the Coulomb law
only for very small radii. As the radius is increased the atomic electrons
gradually shield, or cancel out, the nuclear charge until, for sufficiently
large values of $r$, the potential is zero. The shielding effect of atomic
electrons can be accounted for in a very rough approximate manner with
the formula

$$V(r) = -\frac{Ze^2}{r}\,e^{-r/a} \tag{6.51}$$

In this expression $a$ is called the radius of the atom. It is not the same
as the outer radius of the atom as used by chemists, but instead is given
by $a_0/Z^{1/3}$, where the Bohr radius is $a_0 = \hbar^2/me^2 = 0.0529$ nm.

Show that in such a potential

$$v(\breve{p}) = -\frac{4\pi Ze^2}{(\breve{p}/\hbar)^2 + (1/a)^2} \tag{6.52}$$

and hence

$$\frac{d\sigma}{d\Omega} = \frac{Z^2e^4}{(mu^2/2)^2\,[4\sin^2(\theta/2) + (\hbar/pa)^2]^2} \tag{6.53}$$

The total cross section $\sigma_T$ is defined as the integral of
$d\sigma/d\Omega$ over the unit sphere; thus

$$\sigma_T = \int_{4\pi}\frac{d\sigma}{d\Omega}\,d\Omega \tag{6.54}$$

In the present example show that

$$\sigma_T = \frac{\pi a^2\,(2Ze^2/u\hbar)^2}{1 + (\hbar/2pa)^2} \tag{6.55}$$
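The closed forms of Prob. 6-8 can be checked numerically before attempting the algebra. The sketch below is an illustration under stated assumptions: units with $\hbar = m = e = 1$, the screened potential $-(Ze^2/r)e^{-r/a}$, the radial integral of Eq. (6.45) cut off at a large multiple of $a$, and a hand-written Simpson rule. All function names are hypothetical. The closed forms coded here are $-4\pi Ze^2/[(\breve{p}/\hbar)^2 + a^{-2}]$ for $v$, the differential cross section of Eq. (6.53), and $\pi a^2 (2Ze^2/u\hbar)^2/[1 + (\hbar/2pa)^2]$ for the total cross section.

```python
import math

HBAR = 1.0  # assumption: units with hbar = m = e = 1


def simpson(f, lo, hi, n=20000):
    """Composite Simpson rule; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0


def v_numeric(p_transfer, Z, a, cutoff=60.0):
    """Radial integral of Eq. (6.45) for V(r) = -(Z/r) exp(-r/a).

    V(r) * r = -Z exp(-r/a) is coded directly to avoid dividing by r = 0.
    The upper limit is truncated at cutoff * a (an assumption).
    """
    integrand = lambda r: math.sin(p_transfer * r / HBAR) * (-Z * math.exp(-r / a))
    return 4.0 * math.pi * HBAR / p_transfer * simpson(integrand, 0.0, cutoff * a)


def v_closed(p_transfer, Z, a):
    """Closed form for the screened Coulomb potential."""
    return -4.0 * math.pi * Z / ((p_transfer / HBAR) ** 2 + a ** -2)


def dsigma_domega(theta, p, Z, a, m=1.0):
    """Differential cross section; p = m u is the electron momentum."""
    kinetic = p * p / (2.0 * m)
    bracket = 4.0 * math.sin(theta / 2.0) ** 2 + (HBAR / (p * a)) ** 2
    return Z * Z / (kinetic ** 2 * bracket ** 2)


def rutherford(theta, p, Z, m=1.0):
    """Unscreened (Rutherford) limit, reached as a grows without bound."""
    kinetic = p * p / (2.0 * m)
    return Z * Z / (16.0 * kinetic ** 2 * math.sin(theta / 2.0) ** 4)


def sigma_total_numeric(p, Z, a, m=1.0):
    """Integrate the differential cross section over the unit sphere."""
    f = lambda th: 2.0 * math.pi * math.sin(th) * dsigma_domega(th, p, Z, a, m)
    return simpson(f, 0.0, math.pi)


def sigma_total_closed(p, Z, a, m=1.0):
    u = p / m
    return math.pi * a * a * (2.0 * Z / (u * HBAR)) ** 2 / (1.0 + (HBAR / (2.0 * p * a)) ** 2)


print(v_numeric(1.5, 2, 1.0), v_closed(1.5, 2, 1.0))
print(sigma_total_numeric(2.0, 1, 1.0), sigma_total_closed(2.0, 1, 1.0))
```

Agreement between the numerical and closed-form values is a check on the algebra, not a substitute for the derivation the problem asks for.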


Problem 6-9 Suppose we introduce the fact that the atomic nucleus
has a finite radius given by

$$r_N = 1.2\ \text{fm} \times (\text{mass number})^{1/3} \tag{6.56}$$

and assume that the nuclear charge is distributed approximately
uniformly in a sphere of this radius. What is the effect of this assumption
on the cross section for the scattering of electrons by atoms at large
values of the momentum transfer $\breve{p}$?

Show how the nuclear radius can be determined along with some of 
the details of the nuclear charge distribution by making use of this effect. 
How large must the momentum p of the incoming electrons be in order 
to produce an appreciable effect? Would one observe more carefully the 
large or small scattering angles? Why? 

Note: In this type of experiment the required electron momentum is 
so high that actually the relativistic formula $E = \sqrt{(mc^2)^2 + (pc)^2} - mc^2$
must be used to find the kinetic energy. So, strictly, we should not 
be allowed to use nonrelativistic formulas to describe the interaction. 
However, the relations between momentum and wavelength and between 
energy and frequency are not changed in the relativistic region. Since it 
is the wavelength which determines the resolving power of this “electron 
microscope,” the momentum calculated by nonrelativistic formulas is 
still correct. 
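To get a feeling for the numbers involved, the sketch below estimates the electron momentum at which the reduced wavelength $\hbar/p$ becomes comparable to the nuclear radius of Eq. (6.56), and then evaluates the relativistic kinetic energy quoted in the note. The criterion $\hbar/p \approx r_N$ is an assumed order-of-magnitude choice, not a result from the text; the constants $\hbar c \approx 197.327$ MeV fm and $mc^2 \approx 0.511$ MeV are standard values, and the function names are hypothetical.

```python
import math

HBAR_C_MEV_FM = 197.327    # hbar * c in MeV fm
ELECTRON_MC2_MEV = 0.511   # electron rest energy in MeV


def nuclear_radius_fm(mass_number):
    """Eq. (6.56): r_N = 1.2 fm * A^(1/3)."""
    return 1.2 * mass_number ** (1.0 / 3.0)


def momentum_scale_mev(mass_number):
    """pc in MeV at which hbar / p equals r_N (assumed criterion)."""
    return HBAR_C_MEV_FM / nuclear_radius_fm(mass_number)


def kinetic_energy_mev(pc_mev):
    """Relativistic kinetic energy sqrt((mc^2)^2 + (pc)^2) - mc^2."""
    return math.sqrt(ELECTRON_MC2_MEV ** 2 + pc_mev ** 2) - ELECTRON_MC2_MEV


for mass_number in (12, 56, 208):
    pc = momentum_scale_mev(mass_number)
    print(mass_number, nuclear_radius_fm(mass_number), pc, kinetic_energy_mev(pc))
```

For every nucleus in the loop the resulting kinetic energy is many times the electron rest energy, which is why the note insists on the relativistic formula.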


Problem 6-10 Consider a diatomic molecule containing two atoms,
A and B, arranged with their centers at the points given by the vectors
$\mathbf{a}$ and $\mathbf{b}$. Using the Born approximation, show that the
amplitude for an electron to be scattered from such a molecule is

$$K^{(1)} = e^{(i/\hbar)\breve{\mathbf{p}}\cdot\mathbf{a}} f_A(\breve{\mathbf{p}}) + e^{(i/\hbar)\breve{\mathbf{p}}\cdot\mathbf{b}} f_B(\breve{\mathbf{p}}) \tag{6.57}$$

where $f_A$ and $f_B$ are the amplitudes for scattering by the two atoms
individually when each atom is located at the center of a coordinate
system. (Within the Born approximation, these $f$ values are real for
spherically symmetric potentials.) The atomic binding does not change
the charge distributions around the nuclei very much (except for very
light nuclei such as hydrogen) because the binding forces affect only a
few of the outermost electrons.

Using Eq. (6.57), show that the probability of scattering at a particular
value of $\breve{\mathbf{p}}$ is proportional to
$f_A^2 + f_B^2 + 2f_A f_B\cos(\breve{\mathbf{p}}\cdot\mathbf{d}/\hbar)$,
where $\mathbf{d}$ is $\mathbf{a} - \mathbf{b}$.


Problem 6-11 Suppose the diatomic molecules are oriented in a 
random fashion. Show that the electron scattering averaged over a group 
of such molecules is proportional to 


$$f_A^2 + f_B^2 + 2f_A f_B\,\frac{\sin(|\breve{\mathbf{p}}||\mathbf{d}|/\hbar)}{|\breve{\mathbf{p}}||\mathbf{d}|/\hbar}$$


How can this result be generalized to the case of polyatomic molecules? 
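The orientation average in Prob. 6-11 can be checked with a short numerical experiment. The sketch below assumes that random orientation means the cosine of the angle between the momentum transfer and the molecular axis is uniformly distributed on $[-1, 1]$, and it averages the interference factor $\cos(x\mu)$, with $x$ standing for the product of momentum-transfer magnitude and atomic separation divided by $\hbar$, using a midpoint rule. The function names are hypothetical.

```python
import math


def orientation_average_cos(x, n=20000):
    """Average of cos(x * mu) over mu uniform on [-1, 1] (midpoint rule)."""
    h = 2.0 / n
    total = sum(math.cos(x * (-1.0 + (k + 0.5) * h)) for k in range(n))
    return total * h / 2.0


def averaged_intensity(f_a, f_b, x):
    """f_A^2 + f_B^2 + 2 f_A f_B sin(x) / x, with x in units of hbar."""
    return f_a ** 2 + f_b ** 2 + 2.0 * f_a * f_b * math.sin(x) / x


x = 3.7  # arbitrary sample value
print(orientation_average_cos(x), math.sin(x) / x)
```

The same averaging applies to each pair of atoms separately, which suggests how one might approach the polyatomic case.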


These results form the basis of electron diffraction techniques which 
make possible the determination of the form of molecules. The values 
of f computed through the Born approximation are real and the result 
is valid for electron energies usually used in diffraction experiments on 
molecules (the order of 1 keV). However, if the molecule includes the 
very heaviest atoms, such as uranium, the atomic potential is too large 
for the results to be adequately described by the Born approximation, 
and small corrections are necessary. 


Problem 6-12 Assume that $V(\mathbf{r})$ is independent of time and show
that the time integral of the second-order scattering term $K^{(2)}(b,a)$
gives

$$K^{(2)}(b,a) = \left(-\frac{i}{\hbar}\right)^2\left(\frac{m}{2\pi i\hbar}\right)^{7/2}\frac{1}{T^{3/2}}\iint \frac{R_{bc} + R_{cd} + R_{da}}{R_{bc}R_{cd}R_{da}}\,\exp\!\left[\frac{im}{2\hbar T}(R_{bc} + R_{cd} + R_{da})^2\right] V(\mathbf{x}_c)\,V(\mathbf{x}_d)\,d^3\mathbf{x}_c\,d^3\mathbf{x}_d \tag{6.58}$$

where the points a, d, c, and b are arranged as shown in Fig. 6-9. The
term $R_{cd}$ stands for the distance from point d to point c, etc.

Assume that $V(\mathbf{r})$ becomes negligibly small at distances which are
short compared to $r_a$ or $r_b$. Show that the cross section is given by
$d\sigma/d\Omega = |f|^2$, where the scattering amplitude $f$, including the
first-order term, is

$$f = -\frac{m}{2\pi\hbar^2}\int e^{-i(\mathbf{p}_b/\hbar)\cdot\mathbf{x}_c}\,V(\mathbf{x}_c)\,e^{i(\mathbf{p}_a/\hbar)\cdot\mathbf{x}_c}\,d^3\mathbf{x}_c + \left(\frac{m}{2\pi\hbar^2}\right)^2\iint e^{-i(\mathbf{p}_b/\hbar)\cdot\mathbf{x}_c}\,V(\mathbf{x}_c)\,\frac{e^{i(p/\hbar)R_{cd}}}{R_{cd}}\,V(\mathbf{x}_d)\,e^{i(\mathbf{p}_a/\hbar)\cdot\mathbf{x}_d}\,d^3\mathbf{x}_c\,d^3\mathbf{x}_d + \text{higher-order terms} \tag{6.59}$$

Here $\mathbf{p}_b$ is the momentum of the electron traveling in the
direction of $\mathbf{x}_b$ and $\mathbf{p}_a$ is the momentum of the
electron traveling in the direction of $-\mathbf{x}_a$. The magnitude of the
momentum is $p$, and it is approximately unchanged by an elastic scattering
of the electron from the (relatively massive) atom.

Fig. 6-9 To increase the accuracy of scattering calculations, we can take
account of second-order terms in the perturbation expansion. Here, as in
Fig. 6-2 (3), we picture the electron as being scattered at two separate points
in the atomic potential. Thus the electron starts at a; moves as a free particle
to d, where it is scattered; then moves as a free particle to c, where it is
scattered again; and finally moves as a free particle to b, where it is collected
by the counter. The points d and c can lie at any position in space. The
atomic potential at these positions depends upon the radius vectors
$\mathbf{x}_d$ and $\mathbf{x}_c$, measured from the center of the atom O.


One might expect that in a situation in which the Born approxima- 
tion is not adequate it would be worthwhile to compute the second-order 
term as a correction. But in practice it seems that in this application 
Eq. (6.59) is a kind of asymptotic series. If the second term makes an 
appreciable correction (say 10 per cent or more) the higher terms are 
not much smaller and the true correction cannot be gotten easily by this 
method. Of course, if it is a problem in which the errors of the Born 
approximation are small (say less than 1 per cent), the second term will 
be adequate to find the corrections. 


The Wave Function Treatment of Scattering. In the scattering 
experiment which we have described we have assumed that the initial 
state of the incoming electron was that of a free particle with momentum 
Pa. We have assumed that the value of the momentum is determined 
by a time-of-flight technique (i.e., the total time required to travel the 
distance $r_a + r_b$ is $T$).

It is not necessary to use such a technique. Any device which enables 
us to determine the momentum is equally satisfactory. So suppose we 
generalize our picture of scattering phenomena, with the help of the wave 
function method.

Suppose the incoming electrons have momentum $\mathbf{p}_a$ and energy
$E_a = p_a^2/2m$. Then the wave function for the incoming electrons is

$$\psi_a(\mathbf{x}, t) = e^{(i/\hbar)\mathbf{p}_a\cdot\mathbf{x}}\,e^{-(i/\hbar)E_a t} \tag{6.60}$$

Then, using the first two terms of Eq. (6.25), the wave function for the
outgoing electrons is, to first order,

$$\psi(\mathbf{x}_b, t_b) = e^{(i/\hbar)\mathbf{p}_a\cdot\mathbf{x}_b}\,e^{-(i/\hbar)E_a t_b} - \frac{i}{\hbar}\int_0^{t_b}\!\!\int K_0(\mathbf{x}_b, t_b; \mathbf{x}_c, t_c)\,V(\mathbf{x}_c, t_c)\,e^{(i/\hbar)\mathbf{p}_a\cdot\mathbf{x}_c}\,e^{-(i/\hbar)E_a t_c}\,d^3\mathbf{x}_c\,dt_c \tag{6.61}$$

The first term represents the alternative of the particle passing through
the potential region without scattering. The second term represents the
alternative of the particle scattering, summed over all possible scattering
locations. This second term is called $\psi_s(\mathbf{x}_b, t_b)$, the
scattered wave.


Problem 6-13 Assume that $V(\mathbf{r},t)$ is independent of time.
Substitute the free-particle kernel $K_0$ into Eq. (6.61) and integrate over
$t_c$ to show that

$$\psi(\mathbf{x}_b, t_b) = e^{-(i/\hbar)E_a t_b}\left[e^{(i/\hbar)\mathbf{p}_a\cdot\mathbf{x}_b} - \frac{m}{2\pi\hbar^2}\int \frac{e^{i(p/\hbar)R_{bc}}}{R_{bc}}\,V(\mathbf{x}_c)\,e^{(i/\hbar)\mathbf{p}_a\cdot\mathbf{x}_c}\,d^3\mathbf{x}_c\right] \tag{6.62}$$

where $R_{bc}$ is the distance from the variable point of integration
$\mathbf{x}_c$ to the final point $\mathbf{x}_b$ and $p$ is the magnitude of
the momentum of the electron.

Once again suppose that the potential drops to 0 for distances which
are short compared to either $r_a$ or $r_b$. Show that Eq. (6.62) can be
written as

$$\psi(\mathbf{x}_b, t_b) = e^{-(i/\hbar)E_a t_b}\left[e^{(i/\hbar)\mathbf{p}_a\cdot\mathbf{x}_b} + f(\theta)\,\frac{e^{i(p/\hbar)r_b}}{r_b}\right] \tag{6.63}$$

where the scattering amplitude $f(\theta)$ is defined in terms of
$v(\breve{\mathbf{p}})$ (see Eq. 6.39) as

$$f(\theta) = -\frac{m}{2\pi\hbar^2}\,v(\breve{\mathbf{p}}) \tag{6.64}$$

The last term of Eq. (6.63), $f(\theta)e^{i(p/\hbar)r_b}/r_b$, can be thought
of as the spatial part of the scattered wave function. It has the form of a
spherical wave radiating outward from the center of the scattering atom. The
amplitude of this spherical wave at some particular scattering angle depends
upon that angle through the function $f(\theta)$ which, by Eq. (6.64), varies
with the momentum transfer $\breve{\mathbf{p}}$. Thus the complete wave
function for the electron after scattering can be thought of as the sum of
two terms. The first term is the plane wave of the nonscattered alternative,
$e^{i(\mathbf{p}_a/\hbar)\cdot\mathbf{x}_b}$, and the second term is the
spherical wave of the scattered alternative, as indicated in Fig. 6-10. Use
this point of view to derive the formula for the cross section
$d\sigma/d\Omega$.

Fig. 6-10 An electron, represented by its equivalent wave, moves toward
the atom at O. The strongest amplitude is for the electron to continue on
undisturbed as a plane wave with momentum $\mathbf{p}_a$. A weaker amplitude
is for the electron to be scattered and move away from O as a spherical wave.
The resulting amplitude of ending up at some point b, located at
$\mathbf{x}_b$ relative to the atom at O, is then made up of two parts. The
first is the nonscattered amplitude given by the plane wave
$e^{i(\mathbf{p}_a/\hbar)\cdot\mathbf{x}_b}$. To this is added the scattered
wave of spherical form $e^{i(p/\hbar)r_b}/r_b$ times the scattering amplitude
$f(\theta)$. The combination of these two waves gives the spatial part of the
scattered wave function.


Problem 6-14 Use the wave function approach to discuss the scattering
of an electron from a sinusoidally oscillating field whose potential
is given by

$$V(\mathbf{x},t) = U(\mathbf{x})\cos\omega t \tag{6.65}$$

Show that in the first-order Born approximation the energy of the outgoing
wave is changed by either $+\hbar\omega$ or $-\hbar\omega$. What happens in
the higher-order terms?


6-5 TIME-DEPENDENT PERTURBATIONS
AND TRANSITION AMPLITUDES


The Transition Amplitude. An especially useful form of the perturbation
theory occurs if the unperturbed problem corresponds to a
potential $U$ independent of time, for then we have seen in Eq. (4.59)
that the unperturbed kernel can be expanded as (now in one dimension
for convenience)

$$K_U(b,a) = \sum_n \phi_n(x_b)\,\phi_n^*(x_a)\,e^{-(i/\hbar)E_n(t_b - t_a)} \quad \text{for } t_b > t_a \tag{6.66}$$

in terms of the eigenfunctions $\phi_n(x)$ and eigenvalues $E_n$ of the
unperturbed hamiltonian. Let us look at our series for $K_V(b,a)$ after
substituting this expression for $K_U$. Writing out the first two terms, it is
(compare Eq. 6.10)

$$K_V(b,a) = \sum_n \phi_n(x_b)\,\phi_n^*(x_a)\,e^{-(i/\hbar)E_n(t_b - t_a)} - \frac{i}{\hbar}\sum_m\sum_n\int_{t_a}^{t_b}\!\!\int_{-\infty}^{\infty}\phi_m(x_b)\,\phi_m^*(x_c)\,e^{-(i/\hbar)E_m(t_b - t_c)}\,V(x_c,t_c)\,\phi_n(x_c)\,\phi_n^*(x_a)\,e^{-(i/\hbar)E_n(t_c - t_a)}\,dx_c\,dt_c + \cdots \tag{6.67}$$

It is clear that within each term the variable $x_a$ will appear in some
energy eigenfunction, like $\phi_n^*(x_a)$, and the $x_b$ likewise, so we can
always write $K_V$ in the form

$$K_V(b,a) = \sum_m\sum_n \lambda_{mn}(t_b,t_a)\,\phi_m(x_b)\,\phi_n^*(x_a) \tag{6.68}$$

where the $\lambda$'s are coefficients depending on $t_b$, $t_a$. We shall
call these coefficients transition amplitudes. To zero order in $V$, this must
reduce to $K_U$, so to this order
$\lambda_{mn} = \delta_{mn}e^{-(i/\hbar)E_n(t_b - t_a)}$. If we expand
$\lambda$ in a series in increasing orders of $V$, we obtain

$$\lambda_{mn} = \delta_{mn}\,e^{-(i/\hbar)E_n(t_b - t_a)} + \lambda_{mn}^{(1)} + \lambda_{mn}^{(2)} + \cdots \tag{6.69}$$

and comparison to Eq. (6.67) shows

$$\lambda_{mn}^{(1)} = -\frac{i}{\hbar}\int_{t_a}^{t_b}\!\!\int_{-\infty}^{\infty} e^{-(i/\hbar)E_m(t_b - t_c)}\,\phi_m^*(x_c)\,V(x_c,t_c)\,\phi_n(x_c)\,e^{-(i/\hbar)E_n(t_c - t_a)}\,dx_c\,dt_c \tag{6.70}$$

Problem 6-15 Recall that in Prob. 5-4 we defined a particular
integral as the transition amplitude to go from state $\psi(x)$ to state
$\chi(x)$. Show that the function $\lambda_{mn}$ satisfies this definition
when the initial state is the eigenfunction $\phi_n(x)$ and the final state is
the eigenfunction $\phi_m(x)$.


Define, for brevity,

$$V_{mn}(t_c) = \int_{-\infty}^{\infty}\phi_m^*(x_c)\,V(x_c,t_c)\,\phi_n(x_c)\,dx_c \tag{6.71}$$

(This is called the matrix element of $V$ between states $m$ and $n$.) Then
Eq. (6.70) can be written

$$\lambda_{mn}^{(1)} = -\frac{i}{\hbar}\,e^{-(i/\hbar)E_m t_b}\,e^{+(i/\hbar)E_n t_a}\int_{t_a}^{t_b} V_{mn}(t_c)\,e^{+(i/\hbar)(E_m - E_n)t_c}\,dt_c \tag{6.72}$$

This is an important result of the time-dependent perturbation theory.

The coefficient $\lambda_{mn}$ is the amplitude for the system to be found in
state $m$ at time $t_b$ if it was in state $n$ at time $t_a$. Suppose the wave
function at $t_a$ was $\phi_n(x_a)$. What is it at $t_b$? Using Eq. (3.42), we
can express the wave function at $t_b$ as

$$\int_{-\infty}^{\infty} K_V(b,a)\,\phi_n(x_a)\,dx_a = \sum_j\sum_k \lambda_{jk}\,\phi_j(x_b)\int_{-\infty}^{\infty}\phi_k^*(x_a)\,\phi_n(x_a)\,dx_a = \sum_j \lambda_{jn}\,\phi_j(x_b) \tag{6.73}$$

That is, the wave function at $t_b$ is in the form $\sum_m C_m\phi_m(x_b)$.

This expansion in terms of eigenfunctions was first introduced in
Eq. (4.48). Now we can assign a deeper meaning to the constants $C_m$.
We can interpret $C_m$ as the amplitude that the system is in state
$\phi_m(x)$. In this particular case, $C_m = \lambda_{mn}$ is the amplitude
for the system to be in state $\phi_m(x)$ at time $t_b$ if it was in
$\phi_n(x)$ at time $t_a$.

With no perturbation acting, a system once in state $n$ is always in
state $n$, with an amplitude varying in time. So, to zero order,

$$\lambda_{mn} = \delta_{mn}\,e^{-(i/\hbar)E_n(t_b - t_a)}$$

We can interpret the first-order term by the rule (see Fig. 6-11) that:
The amplitude to be scattered from state $n$ to state $m$ within a time $dt$
is $-(i/\hbar)V_{mn}(t)\,dt$.


Problem 6-16 Interpret Eq. (6.71) as a sum over alternatives; i.e.,
identify the alternatives.

Fig. 6-11 A system initially in the $n$th energy state is subjected to a
potential $V$ which "scatters" the system into all of the states available to
it. The amplitude for scattering into the $m$th state is proportional to
$V_{mn}$. In particular, the amplitude to be scattered from the state $n$ to
the state $m$ in the time interval $dt$ is $-(i/\hbar)V_{mn}(t)\,dt$.
[Diagram: energy levels $n - 1$, $n$, $n + 1$ drawn for the initial state and
for the final state.]


Problem 6-17 Interpret Eq. (6.72) by explaining the meaning of
each term. Then explain and verify the equation for the second-order
coefficient

$$\lambda_{mn}^{(2)} = \left(-\frac{i}{\hbar}\right)^2\int_{t_a}^{t_b}\left[\int_{t_a}^{t_c}\sum_j e^{-(i/\hbar)E_m(t_b - t_c)}\,V_{mj}(t_c)\,e^{-(i/\hbar)E_j(t_c - t_d)}\,V_{jn}(t_d)\,e^{-(i/\hbar)E_n(t_d - t_a)}\,dt_d\right]dt_c \tag{6.74}$$

Problem 6-18 Derive and interpret the integral equation

$$\lambda_{mn}(t_b,t_a) = \delta_{mn}\,e^{-(i/\hbar)E_m(t_b - t_a)} - \frac{i}{\hbar}\int_{t_a}^{t_b} e^{-(i/\hbar)E_m(t_b - t_c)}\sum_j V_{mj}(t_c)\,\lambda_{jn}(t_c,t_a)\,dt_c \tag{6.75}$$

Problem 6-19 Consider $\lambda_{mn}(t_b)$ as a function of the final time
$t_b$. Show, using either Eq. (6.75) or (6.69), that

$$\frac{d}{dt_b}\lambda_{mn}(t_b) = -\frac{i}{\hbar}\left[E_m\lambda_{mn}(t_b) + \sum_j V_{mj}(t_b)\,\lambda_{jn}(t_b)\right] \tag{6.76}$$

Give a direct physical interpretation of this result. Next, deduce this
result directly from the Schrödinger equation. Hint: Use Eq. (6.73) and
substitute into the Schrödinger equation. Note that Eq. (6.76), with
the initial condition $\lambda_{mn}(t_a) = \delta_{mn}$, could be used to
determine the $\lambda$'s
directly.

We can interpret all of the terms in Eq. (6.69) using the rule that
$-(i/\hbar)V_{mn}(t)\,dt$ is the amplitude that the potential $V$ will scatter
(or induce a transition) from a state $n$ to a state $m$ during the time
interval $dt$. We can go from the state $n$ to the state $m$ by 0, 1, 2, or
more scatterings. We can go directly between the two states, i.e., with no
scatterings, only if $m = n$. Thus the first term in the expansion is
proportional to $\delta_{mn}$.

The second term, given by Eq. (6.72), gives the amplitude that the
transition will take place as the result of a single scattering. The
amplitude for the particle to be found in the initial state $n$ is
$e^{-(i/\hbar)E_n(t_c - t_a)}$ at the time $t_c$. (In this case the phrase
"to be found in the state $n$" should be interpreted as "available for
scattering from the state $n$ by the potential $V$.") The amplitude to be
scattered by the potential $V(t_c)$ from the state $n$ to the state $m$ is
$-(i/\hbar)V_{mn}(t_c)$. Finally, the amplitude to be found in the state $m$
(which in this case means "the amplitude that the state $m$ shall be
available to the particle at the time the scattering takes place") at the
time $t_b$ is proportional to $e^{-(i/\hbar)E_m(t_b - t_c)}$. This scattering
(at time $t_c$) can take place at any time between $t_a$ and $t_b$.
Therefore, an integration over the time $t_c$ is carried out between these
two end points.

The third term, given by Eq. (6.74), is the amplitude for a transition
as the result of a double, or second-order, scattering. The first scattering
takes the system from its initial state $n$ to the intermediate state $j$ at
time $t_d$. The system then stays in this state until the time $t_c$, when
its availability for scattering is again measured by an exponential function,
$e^{-(i/\hbar)E_j(t_c - t_d)}$. Another scattering takes place at the time
$t_c$ and carries the system from the state $j$ to the state $m$. We
integrate over all of the possible alternate times for the scatterings at
$t_d$ and $t_c$, requiring only that $t_d$ be earlier than $t_c$. Next, we
add over all the possible states $j$ into which the system may have been
scattered in the intermediate interval.

The terms of Eq. (6.69), which we have just interpreted, give the 
results of the general time-dependent perturbation theory. It is applica- 
ble when the unperturbed system has a constant hamiltonian and thus 
definite energy values. Next, we shall study some special cases of this 
theory in more detail. 


First-order transitions. First, let us take the case that the final
state $m$ is different from the initial state $n$ and let us consider only the
first Born approximation, i.e., the second term in Eq. (6.69). The result
will be applicable for small values of $V$. The amplitude that we make
the transition from $n$ to $m$ is

$$\lambda_{mn}^{(1)} = -\frac{i}{\hbar}\,e^{-(i/\hbar)(E_m t_b - E_n t_a)}\int_{t_a}^{t_b} V_{mn}(t)\,e^{+(i/\hbar)(E_m - E_n)t}\,dt \tag{6.77}$$

This is a very important special formula in the time-dependent perturbation
theory. Suppose as a first example that $V(x,t) = V(x)$ is not an
explicit function of time. If we take the interval of time from 0 to $T$,
then since $V_{mn}$ is constant, we have

$$\lambda_{mn}^{(1)}\,e^{+(i/\hbar)(E_m t_b - E_n t_a)} = -\frac{i}{\hbar}V_{mn}\int_0^T e^{+(i/\hbar)(E_m - E_n)t}\,dt = -V_{mn}\,\frac{e^{+(i/\hbar)(E_m - E_n)T} - 1}{E_m - E_n} \tag{6.78}$$

The probability of a transition during the time interval $T$ is then

$$P(n \to m) = |\lambda_{mn}^{(1)}|^2 = |V_{mn}|^2\,\frac{4\sin^2[(E_m - E_n)T/2\hbar]}{(E_m - E_n)^2} \tag{6.79}$$

We see that for at least a long interval $T$ this probability is a rapidly
oscillating function of the energy difference $E_m - E_n$. If $E_m$ and $E_n$
differ appreciably, i.e., if $|E_m - E_n| \gg |V_{mn}|$, this probability is
very small. This means that the probability that the energy in the final state
will be modified appreciably from that in the initial state by a very weak
constant perturbation is very small. One might ask: How can the energy
be expected to change at all by the large amount $E_m - E_n$ as a result
of the small disturbance $V_{mn}$? The answer is that we have considered
$V$ to start suddenly at the time $t = 0$, and the definiteness of this time
permits, by the uncertainty principle, a large uncertainty in the energy
(see Eq. (5.19) and associated discussion).
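The behavior of Eq. (6.79) is easy to explore numerically. The sketch below is an illustration only: it evaluates the closed form $|V_{mn}|^2\,4\sin^2[(E_m - E_n)T/2\hbar]/(E_m - E_n)^2$ and compares it with a direct midpoint-rule evaluation of the time integral in Eq. (6.78) for a constant matrix element. Units with $\hbar = 1$, the sample numbers, and the function names are all assumptions.

```python
import cmath
import math


def first_order_probability(v_mn, delta_e, t, hbar=1.0):
    """Closed form of Eq. (6.79); delta_e stands for E_m - E_n."""
    if delta_e == 0.0:
        # Limit of the expression as the two energies coincide.
        return (abs(v_mn) * t / hbar) ** 2
    return abs(v_mn) ** 2 * 4.0 * math.sin(delta_e * t / (2.0 * hbar)) ** 2 / delta_e ** 2


def first_order_probability_numeric(v_mn, delta_e, t, hbar=1.0, n=20000):
    """Square of |-(i/hbar) V_mn * integral_0^T exp(i delta_e t / hbar) dt|."""
    h = t / n
    integral = sum(cmath.exp(1j * delta_e * (k + 0.5) * h / hbar) for k in range(n)) * h
    return abs(-1j / hbar * v_mn * integral) ** 2


v_mn, t = 0.02, 5.0  # arbitrary sample values
for delta_e in (0.0, 0.1, 0.7, 3.0):
    print(delta_e, first_order_probability(v_mn, delta_e, t))
```

The printed values never exceed $4|V_{mn}|^2/(E_m - E_n)^2$ for nonzero energy difference, which is the sense in which a weak constant perturbation rarely changes the energy appreciably.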


Problem 6-20 Suppose $V$ is turned on and off slowly. For example,
let $V(x,t) = V(x)g(t)$, where $g(t)$ is smooth, as shown in Fig. 6-12.

$$g(t) = \begin{cases} \tfrac{1}{2}e^{\gamma t} & \text{for } t < 0 \\ 1 - \tfrac{1}{2}e^{-\gamma t} & \text{for } 0 < t < T/2 \\ 1 - \tfrac{1}{2}e^{\gamma(t - T)} & \text{for } T/2 < t < T \\ \tfrac{1}{2}e^{-\gamma(t - T)} & \text{for } T < t \end{cases} \tag{6.80}$$

The rise time of the function $g(t)$ is $1/\gamma$. Supposing that
$1/\gamma \ll T$, show that the probability given by Eq. (6.79) is reduced by
a factor $\{1 + [(E_m - E_n)/\hbar\gamma]^2\}^{-2}$. In this definition of
$g(t)$ we still have a discontinuity in the second derivative with respect to
time. Smoother functions make still further reductions.

Fig. 6-12 The potential effecting the transition from n to m is turned on and
off slowly with the time variation $g(t)$, shown here. As this time factor becomes
smoother (e.g., as discontinuities appear in successively higher derivatives) the
probability of a transition becomes smaller.
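The reduction factor of Prob. 6-20 can be checked numerically. The sketch below makes several assumptions: $\hbar = 1$; a switching function built from the exponentials $\tfrac{1}{2}e^{\gamma t}$, $1 - \tfrac{1}{2}e^{-\gamma t}$, $1 - \tfrac{1}{2}e^{\gamma(t-T)}$, and $\tfrac{1}{2}e^{-\gamma(t-T)}$ on the four intervals of the problem; a time integral cut off 40 rise times beyond each end; and a midpoint rule. Names and sample numbers are hypothetical.

```python
import cmath
import math


def g(t, gamma, T):
    """Assumed smooth switching function with rise time 1 / gamma."""
    if t < 0.0:
        return 0.5 * math.exp(gamma * t)
    if t < T / 2.0:
        return 1.0 - 0.5 * math.exp(-gamma * t)
    if t < T:
        return 1.0 - 0.5 * math.exp(gamma * (t - T))
    return 0.5 * math.exp(-gamma * (t - T))


def smooth_probability(v_mn, delta_e, gamma, T, hbar=1.0, n=60000):
    """|-(i/hbar) V_mn * integral g(t) exp(i delta_e t / hbar) dt|^2."""
    lo, hi = -40.0 / gamma, T + 40.0 / gamma  # tails beyond this are negligible
    h = (hi - lo) / n
    total = 0.0 + 0.0j
    for k in range(n):
        t = lo + (k + 0.5) * h
        total += g(t, gamma, T) * cmath.exp(1j * delta_e * t / hbar)
    return abs(v_mn * total * h / hbar) ** 2


def sudden_probability(v_mn, delta_e, T, hbar=1.0):
    """Eq. (6.79) for a perturbation switched on and off abruptly."""
    return abs(v_mn) ** 2 * 4.0 * math.sin(delta_e * T / (2.0 * hbar)) ** 2 / delta_e ** 2


def reduction_factor(delta_e, gamma, hbar=1.0):
    return (1.0 + (delta_e / (hbar * gamma)) ** 2) ** -2


v_mn, delta_e, gamma, T = 0.01, 1.3, 2.0, 20.0  # sample values, 1/gamma << T
ratio = smooth_probability(v_mn, delta_e, gamma, T) / sudden_probability(v_mn, delta_e, T)
print(ratio, reduction_factor(delta_e, gamma))
```

The agreement degrades as $\gamma T$ is made smaller, because the two switching regions then overlap and the condition $1/\gamma \ll T$ no longer holds.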


If it should happen that $E_m$ and $E_n$ are exactly the same energy, we
find $P(n \to m) = |V_{mn}|^2T^2/\hbar^2$. This grows as the square of the
time. It means that a concept of "transition probability per unit time" is not
meaningful in this case. This formula holds only for $T$ short enough
that $|V_{mn}|T \ll \hbar$. It turns out that if only two states of exactly the
same perturbed energy are involved, the probability of being found in
the first goes as $\cos^2(|V_{mn}|T/\hbar)$ and of being found in the second
as $\sin^2(|V_{mn}|T/\hbar)$, while our formula is only a first approximation
to this.


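Problem 6-21 Consider the special case that the perturbing potential
$V$ has no matrix elements except between the two states 1 and
2; and further, suppose these states are degenerate, that is, suppose
$E_1 = E_2$. Let $V_{12} = V_{21} = v$ and let $V_{11}$, $V_{22}$, and all
other $V_{mn}$ be zero. Show that, apart from the common phase factor
$e^{-(i/\hbar)E_1T}$,

$$\lambda_{11} = 1 - \frac{v^2T^2}{2\hbar^2} + \frac{v^4T^4}{24\hbar^4} - \cdots = \cos\frac{vT}{\hbar} \tag{6.81}$$

$$\lambda_{12} = -i\,\frac{vT}{\hbar} + i\,\frac{v^3T^3}{6\hbar^3} - \cdots = -i\sin\frac{vT}{\hbar} \tag{6.82}$$

Problem 6-22 In Prob. 6-21 we have $V_{12} = V_{21}$, so that $V_{12}$ is
real. Show that even if $V_{12}$ is complex, the physical results are the same
(let $v = |V_{12}|$).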


Such systems swing back and forth from one state to the other. A
further conclusion can be drawn from this result. Suppose the perturbation
acts for an extremely long time so that $|V_{mn}|T \gg \hbar$. Then if the
system is investigated at an arbitrary time $T$, which is somewhat
indefinite, the probabilities of being in either the first or second state are,
on the average, equal. That is, a small indefinite perturbation acting
for a very long time between two states at the same energy makes these
states have equal probability. This will be useful when we discuss the
theory of statistical mechanics in Chap. 10.
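The swinging back and forth between two degenerate states can be seen by integrating the differential equation (6.76) directly. The sketch below is an illustration under assumptions: $\hbar = 1$, only two states with $V_{12} = V_{21} = v$ real and $V_{11} = V_{22} = 0$, the system starting in state 1, and a classical fourth-order Runge-Kutta integrator written out by hand. The function name and sample values are hypothetical.

```python
import math


def evolve_two_state(v, e1, e2, t_final, steps=20000, hbar=1.0):
    """Integrate d(lam)/dt = -(i/hbar) (E lam + V lam) for two states.

    Returns the amplitudes (for state 1, for state 2) at t_final for a
    system that starts in state 1 at t = 0.
    """
    def deriv(l1, l2):
        return (-1j / hbar * (e1 * l1 + v * l2),
                -1j / hbar * (e2 * l2 + v * l1))

    l1, l2 = 1.0 + 0.0j, 0.0 + 0.0j
    h = t_final / steps
    for _ in range(steps):
        k1 = deriv(l1, l2)
        k2 = deriv(l1 + 0.5 * h * k1[0], l2 + 0.5 * h * k1[1])
        k3 = deriv(l1 + 0.5 * h * k2[0], l2 + 0.5 * h * k2[1])
        k4 = deriv(l1 + h * k3[0], l2 + h * k3[1])
        l1 += h / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
        l2 += h / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
    return l1, l2


v, energy, T = 0.3, 1.0, 4.0  # arbitrary sample values, E_1 = E_2
a1, a2 = evolve_two_state(v, energy, energy, T)
print(abs(a1) ** 2, math.cos(v * T) ** 2)
print(abs(a2) ** 2, math.sin(v * T) ** 2)
```

Calling the same function with unequal energies shows how quickly the transfer of probability is suppressed once the energy difference exceeds $v$.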

The case of great importance is that in which the values allowed
for $E_m$, the energy of the final state, are not separate and discrete but
lie in a continuum, or at least are extremely closely spaced. Let us
say that $\rho(E)\,dE$ is the number of states in the range of energy $E$ to
$E + dE$. Then we can ask for the probability to go to some state in this
continuum. First we see that to go to any state for which $|E_m - E_n|$ is
large is very unlikely. It is most likely that the final state will be one of
nearly the same energy as the initial $E_n$ (within an error $\pm|V_{mn}|$). The
total chance to go into any state is

$$\sum_m P(n \to m) = \int |V_{mn}|^2\,\frac{4\sin^2[(E_m - E_n)T/2\hbar]}{(E_m - E_n)^2}\,\rho(E_m)\,dE_m \tag{6.83}$$

The quantity $4\sin^2[(E_m - E_n)T/2\hbar]/(E_m - E_n)^2$ is very large if
$E_m \approx E_n$, reaching a maximum of $T^2/\hbar^2$, whereas it is much smaller if
$E_m$ and $E_n$ differ appreciably (relative to $\hbar/T$), as shown in Fig. 6-13.
Thus almost all the contribution to the integral over $E_m$ comes when
$E_m$ is in the neighborhood of the value $E_n$.

Fig. 6-13 [Plot of $(\sin^2 x)/x^2$ against $x$, for $x$ from $-3\pi$ to $3\pi$.] In this figure the energy difference $E_m - E_n$ is replaced by the
variable $x$. When these two energies are approximately equal (thus $x$ is very small)
the function $(\sin^2 x)/x^2$ approaches its maximum value. For large values of
the difference the function becomes very small. Thus, in expressions involving
this function, the most important contributions come from the central region,
that is, the region where the two energies are approximately equal.

If $|V_{mn}|$ varies slowly enough with $m$ that we can replace it with a
typical value, and furthermore if $\rho(E_m)$ likewise does not vary rapidly,
then to a good approximation we can replace the integral of Eq. (6.83)
by the expression

$$4|V_{mn}|^2\rho(E_n)\int_{-\infty}^{\infty}\frac{\sin^2[(E_m - E_n)T/2\hbar]}{(E_m - E_n)^2}\,dE_m \tag{6.84}$$

Since $\int_{-\infty}^{\infty}[(\sin^2 x)/x^2]\,dx = \pi$, the integral of Eq. (6.84) has the value
$\pi T/2\hbar$ and we obtain the result that the probability for a transition to
some state in the continuum is

$$P(n \to m) = \frac{2\pi}{\hbar}|V_{mn}|^2\rho(E_n)\,T \tag{6.85}$$


and that the energy in the final state is the same as the energy in the 
initial state. 
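
The passage from Eq. (6.83) to Eq. (6.85) can be illustrated with a band of equally spaced final levels standing in for the continuum. This is only a sketch under stated assumptions: $\hbar = 1$, a constant matrix element `V`, a level spacing `spacing` (so that $\rho = 1/\text{spacing}$), and a finite band of half-width `half_width`; all of these names and values are chosen here for illustration.

```python
import numpy as np

HBAR = 1.0  # illustrative units with hbar = 1 (an assumption)


def total_transition_probability(V, spacing, T, half_width=200.0, hbar=HBAR):
    """Sum the first-order P(n->m) over equally spaced final levels, as in Eq. (6.83)."""
    # Final energies measured from E_n; the half-spacing offset avoids dE = 0 exactly.
    dE = np.arange(-half_width, half_width, spacing) + 0.5 * spacing
    per_state = abs(V) ** 2 * 4.0 * np.sin(dE * T / (2.0 * hbar)) ** 2 / dE ** 2
    return per_state.sum()


def golden_rule_probability(V, spacing, T, hbar=HBAR):
    """Eq. (6.85) with a constant density of states rho = 1 / spacing."""
    rho = 1.0 / spacing
    return 2.0 * np.pi / hbar * abs(V) ** 2 * rho * T


if __name__ == "__main__":
    V, spacing = 1e-3, 0.01
    for T in (5.0, 10.0, 20.0):
        print(T, total_transition_probability(V, spacing, T),
              golden_rule_probability(V, spacing, T))
```

The two columns should agree closely and both grow linearly with $T$, provided $T$ is long compared with $\hbar$ divided by the band width and short compared with $\hbar$ divided by the level spacing.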

From these results we can write the probability of a transition per
unit time in the form

$$\frac{dP(n \to m)}{dt} = \frac{2\pi}{\hbar}|M_{n \to m}|^2\rho(E) \tag{6.86}$$

where $M_{n \to m}$ is called the matrix element for the transition and $\rho(E)$
is the density of levels in the final state. In our case $M_{n \to m}$ is $V_{mn}$. If
we went to a higher-order expansion of $\lambda_{mn}$, it would be more complicated.
Another way to write this expression is that the probability of a
transition per unit time from state $n$ to some particular state $m$ is

$$\frac{dP(n \to m)}{dt} = \frac{2\pi}{\hbar}|M_{n \to m}|^2\,\delta(E_m - E_n) \tag{6.87}$$

Then when we sum over a group of final states, only those with energies
$E_m = E_n$ survive. Since $\sum_m (\;) \to \int (\;)\rho(E_m)\,dE_m$, we get as a result
Eq. (6.86).

We may illustrate Eq. (6.86) by an example which we have previously
discussed from a different point of view, namely, the scattering
of an electron in a potential (see Sec. 6-4). Suppose an otherwise free
particle has an interaction with a potential $V(\mathbf{r})$ and we wish to discuss
the scattering of this particle from an initial state of a definite
momentum to a final state with another definite momentum in a new
direction. We suppose that the state $n$, the initial state, is a plane wave
of momentum $\mathbf{p}_a$ so that the wave function $\phi_n(\mathbf{x})$ is $e^{(i/\hbar)\mathbf{p}_a\cdot\mathbf{x}}/\sqrt{\mathrm{Vol}}$
(where “Vol” is the volume of the enclosing box as described in Sec. 4-3).
Likewise, suppose the final state is a plane wave of momentum $\mathbf{p}_b$ so that
the wave function $\phi_m(\mathbf{x})$ is $e^{(i/\hbar)\mathbf{p}_b\cdot\mathbf{x}}/\sqrt{\mathrm{Vol}}$. The matrix element $V_{mn}$ is

$$V_{mn} = \frac{1}{\mathrm{Vol}}\int e^{-(i/\hbar)\mathbf{p}_b\cdot\mathbf{x}}\,V(\mathbf{x})\,e^{(i/\hbar)\mathbf{p}_a\cdot\mathbf{x}}\,d^3\mathbf{x} = \frac{v(\breve{\mathbf{p}})}{\mathrm{Vol}} \tag{6.88}$$

where $\breve{\mathbf{p}} = \mathbf{p}_a - \mathbf{p}_b$. In the scattering, the energy will be conserved so
that $p_a^2/2m = p_b^2/2m$. This means that the magnitudes of momenta $\mathbf{p}_a$
and $\mathbf{p}_b$ are the same. Let us call that magnitude $p$, so that

$$|\mathbf{p}_b| = |\mathbf{p}_a| = p$$


By our usual convention for writing differential elements of momentum,
the number of states which have their momenta in the volume element of
momentum space $d^3\mathbf{p}_b$ is $\mathrm{Vol}\,d^3\mathbf{p}_b/(2\pi\hbar)^3 = \mathrm{Vol}\,p^2\,dp\,d\Omega/(2\pi\hbar)^3$, where
$d\Omega$ is the element of solid angle which contains the momentum vector
$\mathbf{p}_b$. An element $dE$ of the energy range is connected to the element of
momentum space by

$$dE = d\left(\frac{p^2}{2m}\right) = \frac{p\,dp}{m} \tag{6.89}$$

Thus the density of momentum states for particles traveling into the
solid angle $d\Omega$ is

$$\rho(E) = \mathrm{Vol}\,\frac{mp\,d\Omega}{(2\pi\hbar)^3} \tag{6.90}$$

Substituting these relations into Eq. (6.86), we find the probability of
transition per second into the element of solid angle $d\Omega$ to be

$$\frac{dP}{dt} = \frac{2\pi}{\hbar}\left|\frac{v(\breve{\mathbf{p}})}{\mathrm{Vol}}\right|^2\mathrm{Vol}\,\frac{mp\,d\Omega}{(2\pi\hbar)^3} \tag{6.91}$$

We define an effective target area or cross section for scattering into
$d\Omega$ as $d\sigma$ (see Sec. 6-4). The number of particles that will hit this area
in time $dt$ is the cross-sectional area times the velocity of the particles
coming in, $u_a = p_a/m$, times $dt$, times the density of incoming particles.
Thus

$$\frac{dP}{dt} = \frac{d\sigma\,u_a}{\mathrm{Vol}} \tag{6.92}$$

Therefore the cross section is

$$\frac{d\sigma}{d\Omega} = \left(\frac{m}{2\pi\hbar^2}\right)^2|v(\breve{\mathbf{p}})|^2 \tag{6.93}$$

which is exactly what we obtained in Eq. (6.44). 
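
The bookkeeping in Eqs. (6.88) to (6.93) is easy to verify with plain numbers. The sketch below assumes $\hbar = 1$ and arbitrary illustrative values for the mass, momentum, and the number `v_p` standing for $v(\breve{\mathbf{p}})$; the function names are hypothetical. It also shows that the box volume drops out of the final result.

```python
import math


def cross_section_per_solid_angle(m, p, v_p, vol, hbar=1.0):
    """d(sigma)/d(Omega) assembled step by step from Eqs. (6.86), (6.88), (6.90), (6.92)."""
    matrix_element = v_p / vol                                       # Eq. (6.88)
    rho_per_solid_angle = vol * m * p / (2 * math.pi * hbar) ** 3    # Eq. (6.90) divided by d(Omega)
    rate_per_solid_angle = (2 * math.pi / hbar) * abs(matrix_element) ** 2 * rho_per_solid_angle
    incoming_speed = p / m                                           # u_a = p_a / m
    # Eq. (6.92): dP/dt = d(sigma) * u_a / Vol, solved for d(sigma)/d(Omega)
    return rate_per_solid_angle * vol / incoming_speed


def born_formula(m, v_p, hbar=1.0):
    """Closed form of Eq. (6.93)."""
    return (m / (2 * math.pi * hbar ** 2)) ** 2 * abs(v_p) ** 2


if __name__ == "__main__":
    for vol in (1.0, 10.0, 1000.0):
        print(vol, cross_section_per_solid_angle(2.0, 0.7, 0.3 + 0.1j, vol),
              born_formula(2.0, 0.3 + 0.1j))
```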


Problem 6-23 Show that the same result is obtained for $d\sigma/d\Omega$ if
the wave functions $\phi(\mathbf{x})$ have the specific normalization of unity for a
box of unit volume, e.g., $\phi_n(\mathbf{x}) = e^{(i/\hbar)\mathbf{p}_a\cdot\mathbf{x}}$.


Problem 6-24 Suppose that the potential $V$ is periodic in time.
For example, suppose $V(x,t) = V(x)(e^{i\omega t} + e^{-i\omega t})$. Show that the probability
for a transition to take place is small unless the final state is one
of the two values (1) $E_{\text{final}} = E_{\text{initial}} + \hbar\omega$ (corresponding to an absorption
of energy) or (2) $E_{\text{final}} = E_{\text{initial}} - \hbar\omega$ (corresponding to an emission
of energy). This means that Eq. (6.86) is unchanged, but the density of
states $\rho(E)$ must be calculated at these new values of $E$. Or, in analogy
with Eq. (6.87), we have

$$\frac{dP(n \to m)}{dt} = \frac{2\pi}{\hbar}|M_{n \to m}|^2\left[\delta(E_m - E_n - \hbar\omega) + \delta(E_m - E_n + \hbar\omega)\right] \tag{6.94}$$


Problem 6-25 It has been argued that the equations of
electrodynamics must, like those of mechanics, be converted to a quantized
form on the basis of the photoelectric effect. Here an electron of energy
$\hbar\omega$ is occasionally emitted from a thin layer of metal under the influence
of light of frequency $\omega$. Is this impossible if matter obeys the quantum
laws but light is still represented as a continuous wave? What arguments
can you adduce for the necessity of giving up a classical description of
electrodynamics, in view of the results of Prob. 6-24?


Problem 6-26 Suppose we have two discrete energy levels $E_1$ and
$E_2$, neither of which is in the continuum. Let a transition be induced by
a potential of the form $V(x,t) = V(x)g(t)$. Show that the probability
of transition is

$$P(1 \to 2) = |V_{12}|^2\,|\phi(\omega_0)|^2/\hbar^2 \tag{6.95}$$

if $g(t)$ is representable by the Fourier transform

$$g(t) = \int_{-\infty}^{\infty}\phi(\omega)\,e^{-i\omega t}\,\frac{d\omega}{2\pi} \tag{6.96}$$

and $\omega_0 = (E_2 - E_1)/\hbar$.


If $g(t)$ is a statistically irregular function familiar from the theory of
noise (called filtered white noise), the value of $\phi(\omega)$ given by the inverse
transform

$$\phi(\omega) = \int_{-T}^{T} g(t)\,e^{+i\omega t}\,dt \tag{6.97}$$

depends on the integration range $T$. If $T$ is very large, $|\phi(\omega_0)|^2$ can
be shown to be proportional to $T$. Thus we get a transition probability
proportional to the time and to the “intensity” or “power” (mean-square
value of $g$ per second) at frequency $\omega_0$ per unit frequency range. In virtue
of this, the probability for the transition of an atom in a continuous
spectrum of light is proportional to (1) the exposure time and (2) the
intensity of light at the frequency of absorption $(E_2 - E_1)/\hbar$.

The Higher-order Terms. It is interesting to look at the second-order
term in the perturbation expansion. This term is of special importance
in problems where $V_{mn} = 0$ for those particular states $n$ and
$m$ of interest. Let us suppose that we have such a problem, and suppose
further that there are other states $j \neq n$ for which $V_{jn} \neq 0$. The first-order
term is 0, and so long as $m \neq n$ the zero-order term is likewise
zero. Thus the lowest-order term which enters into the calculation of
the transition amplitude is the second.

Suppose that the potential $V(x)$ is independent of $t$. Then the
second-order term in the transition element is $\lambda^{(2)}_{mn}$, and if $T = t_b - t_a$,
we have from Eq. (6.74)

$$
\begin{aligned}
\lambda^{(2)}_{mn}\,e^{+(i/\hbar)(E_m t_b - E_n t_a)}
&= \left(-\frac{i}{\hbar}\right)^{2}\sum_j V_{mj}V_{jn}\int_0^T\!\!\int_0^{t_c} e^{+(i/\hbar)(E_m - E_j)t_c}\,e^{+(i/\hbar)(E_j - E_n)t_d}\,dt_d\,dt_c \\
&= \frac{i}{\hbar}\sum_j V_{mj}V_{jn}\int_0^T e^{+(i/\hbar)(E_m - E_j)t_c}\,\frac{e^{+(i/\hbar)(E_j - E_n)t_c} - 1}{E_j - E_n}\,dt_c \\
&= \sum_j \frac{V_{mj}V_{jn}}{E_j - E_n}\left[\frac{e^{+(i/\hbar)(E_m - E_n)T} - 1}{E_m - E_n} - \frac{e^{+(i/\hbar)(E_m - E_j)T} - 1}{E_m - E_j}\right]
\end{aligned}
\tag{6.98}
$$

The first of the two terms in brackets has the same time dependence as
we have seen in our first-order result. Therefore if the second term is
neglected for a moment we see that the net result would again be to make
transitions to states where $E_m = E_n$, with a probability proportional to
$T$. The probability per unit time has the same form as Eq. (6.86) but
with $M_{n \to m}$ now given by

$$M_{n \to m} = -\sum_j \frac{V_{mj}V_{jn}}{E_j - E_n} \tag{6.99}$$

(the sign follows from comparison with the first-order amplitude, which is
$-V_{mn}$ times the same time factor). If the states lie in a continuum, the sum becomes an integral.

Equation (6.99) is correct in the circumstance that it is impossible
to go by a first-order transition from state $n$ to state $m$ or to any state
with the same energy as the initial state. Under these circumstances
$V_{jn} = 0$ for states such that $E_j = E_n$. Then the second term in brackets
in Eq. (6.98) is never large; for it cannot be large unless $E_n - E_j$ is
nearly zero, and then $V_{jn}$ in the numerator is zero. All the effects come
from the first term, and Eq. (6.99) is correct. Furthermore, in the sum
over $j$ in Eq. (6.98) there is no ambiguity at the pole where $E_j = E_m$;
for the numerator vanishes at this same value of $E_j$.

On the other hand, in some situations it may be true that a first-order
transition is possible to some other continuum state (e.g., a nucleus
may decay in more than one way). In such a case the sum in Eq. (6.99)
is meaningless; for we must define what to do near the pole. It is the
neglected second term in Eq. (6.98) which comes to our rescue here and
shows that the correct expression for $M_{n \to m}$ (now including the first-order
term for generality) is

$$M_{n \to m} = V_{mn} - \sum_j \frac{V_{mj}V_{jn}}{E_j - E_n - i\epsilon} \tag{6.100}$$

in the limit $\epsilon \to 0$. How this comes about we shall now analyze.

First we may notice that for large $T$ we cannot get a large probability
of transition (proportional to $T$, that is) unless $E_n$ and $E_m$ are
practically equal (within about $\hbar/T$). This is evident for the first term
in Eq. (6.98). For the second term large amplitudes can arise only if
$E_j \approx E_m$; but if $E_m$ is not very close to $E_n$, the factor in front is a
smooth function of $E_j$ for $E_j$ near $E_m$. Taking it as nearly constant
for a small range near $E_j = E_m$, we see that the second term can be
approximated as some constant times

$$\frac{e^{(i/\hbar)\epsilon T} - 1}{\epsilon}$$

where $\epsilon = E_m - E_j$ is to be integrated over a small range, say $-\delta$ to $+\delta$.
But

$$\int_{-\delta}^{\delta}\frac{e^{(i/\hbar)\epsilon T} - 1}{\epsilon}\,d\epsilon = \int_{-T\delta/\hbar}^{T\delta/\hbar}\frac{e^{iy} - 1}{y}\,dy = \int_{-T\delta/\hbar}^{T\delta/\hbar}\left(\frac{\cos y - 1}{y} + i\,\frac{\sin y}{y}\right)dy \tag{6.101}$$

The first integral is that of an odd function and vanishes. The second
approaches a finite limit as $T \to \infty$ (and therefore as $T\delta/\hbar \to \infty$). That
is,

$$2i\int_0^{\infty}\frac{\sin y}{y}\,dy = \pi i$$

so no large transition probability occurs. A large effect can arise only in
case $E_n$ and $E_m$ are essentially equal, for then the double coincidence of
the two poles from $(E_j - E_n)^{-1}$ and $(E_m - E_j)^{-1}$ can make the second
term important. Therefore, we continue the analysis, assuming $E_m$ and
$E_n$ are nearly equal.

The sum over $j$ in Eq. (6.98) can be divided into two regions by
choosing a very small energy $\Delta$ and breaking the sum up into a part
A for which $|E_j - E_n| > \Delta$ and a part B for which $|E_j - E_n| < \Delta$.
We choose $\Delta$ to be small enough that the factor $V_{mj}V_{jn}$ does not vary
appreciably when $E_j$ varies about $E_n$ over the energy range $2\Delta$. This
is some finite energy, and we shall take $T$ so long that $\hbar/T \ll \Delta$, which
means that $|E_n - E_m| \ll \Delta$.

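First, for part A, $|E_j - E_m| > \Delta$. Then the second term cannot
become large, for its poles are avoided. Only the first contributes, and
the contribution is

$$a'\,\frac{(e^{ix} - 1)\,T}{\hbar x} \tag{6.102}$$

where $x = (1/\hbar)(E_m - E_n)T$ and

$$a' = \sum_j^{(A)}\frac{V_{mj}V_{jn}}{E_j - E_m}$$

The sum extends over all $E_j$ except for those within $\pm\Delta$ of $E_m$. This
sum is nearly independent of $\Delta$, and as $\Delta \to 0$ it is the definition of a
principal-value integral. That is, in the limit $\Delta \to 0$ we can write

$$a = -V_{mn} + \sum_j V_{mj}V_{jn}\,\mathrm{P.P.}\frac{1}{E_j - E_m} \tag{6.103}$$

where P.P. is the principal part and we have reinstated the first-order
term, in case it does not vanish. (It enters with a minus sign because the
first-order amplitude is $-V_{mn}(e^{ix} - 1)T/\hbar x$.)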

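For the region B we take $V_{mj}V_{jn}$ to be constant at its value for
$E_j - E_m = 0$. That is, we replace

$$\sum_j^{(B)} V_{mj}V_{jn}F(E_j) \quad\text{by}\quad \left[\sum_j V_{mj}V_{jn}\,\delta(E_j - E_m)\right]\int_{E_m - \Delta}^{E_m + \Delta} F(E_j)\,dE_j \tag{6.104}$$

We can write this as $bI$, where

$$b = \sum_j V_{mj}V_{jn}\,\delta(E_j - E_m) \tag{6.105}$$

and

$$I = \int_{E_m - \Delta}^{E_m + \Delta}\frac{1}{E_j - E_n}\left[\frac{e^{(i/\hbar)(E_m - E_n)T} - 1}{E_m - E_n} - \frac{e^{(i/\hbar)(E_m - E_j)T} - 1}{E_m - E_j}\right]dE_j \tag{6.106}$$

Now we put $(1/\hbar)(E_m - E_n)T = x$ and $(1/\hbar)(E_j - E_n)T = y$, so that
$(1/\hbar)(E_m - E_j)T = x - y$, to get

$$I = \frac{T}{\hbar}\int_{-T\Delta/\hbar}^{T\Delta/\hbar}\frac{1}{y}\left[\frac{e^{ix} - 1}{x} - \frac{e^{i(x - y)} - 1}{x - y}\right]dy \tag{6.107}$$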

This integral is most easily evaluated by contour integration, imagining
$y$ as a complex variable and changing the contour. Instead of
integrating on the straight line from $-T\Delta/\hbar$ to $T\Delta/\hbar$, we go on the
semicircle of radius $T\Delta/\hbar$ below the real axis. Since $T\Delta/\hbar$ is very large,
the second term contributes negligibly; and since

$$\int_{-T\Delta/\hbar}^{T\Delta/\hbar}\frac{dy}{y} = i\pi$$

on this contour, we get $I = i\pi(T/\hbar)(e^{ix} - 1)/x$.
Putting the A and B parts together, we get

$$\frac{(e^{ix} - 1)\,T}{\hbar x}\,(a + i\pi b) \tag{6.108}$$

for the amplitude. Since the first-order amplitude is $-V_{mn}(e^{ix} - 1)T/\hbar x$, this gives a
probability for transition of the form Eq. (6.86) with

$$M_{n \to m} = -(a + i\pi b) = V_{mn} - \sum_j V_{mj}V_{jn}\left[\mathrm{P.P.}\frac{1}{E_j - E_m} + i\pi\,\delta(E_j - E_m)\right] \tag{6.109}$$

In light of Eq. (A.10), the last bracket can be written $(E_j - E_m - i\epsilon)^{-1}$
in the limit as $\epsilon \to 0$, as we have written in Eq. (6.100).

From Eq. (6.100) we learn then that even if no direct transition
is possible from $n$ to $m$, nevertheless the transition can occur, as we
say, through a virtual state. That is, we can imagine that the system
goes from $n$ to $j$, then from $j$ to $m$. The amplitude for an indirect
transition process is given by Eq. (6.99). We note that it is not right to
say that it actually goes through one or another intermediate state $j$,
but rather that in characteristic quantum-mechanical fashion there is a
certain amplitude to go via the various intermediate states $j$, and the
contributions interfere.

The intermediate states are not of the same energy as the initial and 
final states. The conservation of energy is not violated, for the virtual 
state is not permanently occupied. The strength of contribution to the 
sum varies inversely with this energy discrepancy. 
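
A three-level toy model makes the role of a virtual state concrete. The sketch below is an illustration under assumptions chosen here, not a calculation from the text: $\hbar = 1$, two degenerate states $n$ and $m$ with no direct coupling, and a single intermediate state $j$ at energy `detuning` coupled to both with the same real strength `g`. With one intermediate state, Eq. (6.99) gives $M = g^2/(E_n - E_j)$, and for two degenerate states coupled by $M$ one expects $P(n \to m) \approx \sin^2(|M|T/\hbar)$ as long as $g$ is small compared with the detuning.

```python
import numpy as np

HBAR = 1.0  # illustrative units with hbar = 1 (an assumption)


def transfer_probability(g, detuning, T, hbar=HBAR):
    """Exact P(n->m) when n and m are coupled only through an off-resonant state j."""
    # Basis order: n, m, j.  E_n = E_m = 0, E_j = detuning, and V_mn = 0.
    H = np.array([[0.0, 0.0, g],
                  [0.0, 0.0, g],
                  [g,   g,   detuning]])
    energies, vecs = np.linalg.eigh(H)  # real symmetric, so vecs is real
    U = vecs @ np.diag(np.exp(-1j * energies * T / hbar)) @ vecs.T
    return abs(U[1, 0]) ** 2


if __name__ == "__main__":
    g, detuning = 0.05, 1.0
    M = g * g / (0.0 - detuning)           # Eq. (6.99) with a single intermediate state
    T_full = np.pi * HBAR / (2 * abs(M))   # time at which sin^2(|M| T / hbar) reaches 1
    for frac in (0.25, 0.5, 1.0):
        T = frac * T_full
        print(frac, transfer_probability(g, detuning, T), np.sin(abs(M) * T / HBAR) ** 2)
```

The intermediate state is never appreciably occupied in this regime, yet the transfer from $n$ to $m$ is nearly complete after the time `T_full`.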

There is nothing absolute about these intermediate states. They 
come from considering V as a perturbation to a system H and from 
speaking about the true states of H +V in terms of those of H alone. If 
other separations are made as to what is the “unperturbed” problem and 
what is the “perturbation,” different formulas and intermediate states 
will arise in the description. 

When the potential depends upon time (e.g., periodically), many interesting
effects result. Most of these have been observed in microwave
experiments, where the perturbation $V(x,t)$ is a weak electric or magnetic
field with a periodic variation in time.


Problem 6-27 Derive the perturbation expansion up through the 
terms of the second order for potentials periodic in time. 


Sometimes a transition cannot take place except by the use of two or 
more intermediate virtual states. Analysis of such transitions requires 
the calculation of third- and higher-order terms in the perturbation ex- 
pansion. 


Problem 6-28 Show that when a transition is impossible either
directly or through a single intermediate state, but requires the use of
two intermediate states, it is determined through the matrix element

$$\sum_j\sum_k\frac{V_{mj}V_{jk}V_{kn}}{(E_j - E_n)(E_k - E_n)} \tag{6.110}$$


This corresponds to the third-order term in the perturbation expansion. 


Problem 6-29 Suppose two perturbations, $V(x,t)$ and $U(x,t)$, are
acting. (Examples include a combination of DC and AC electric fields or
a combination of electric and magnetic fields.) Suppose further that a
certain transition cannot occur with either $V$ or $U$ alone, but can occur
only when both act together. Under the special assumption that both $V$
and $U$ are constant in time, show that the matrix element determining
the transition element is given by

$$-\sum_j\frac{V_{mj}U_{jn} + U_{mj}V_{jn}}{E_j - E_n} \tag{6.111}$$

Next, suppose both potentials are periodic in time but have different
frequencies, $\omega_V$ and $\omega_U$. What then is the matrix element?

Calculation of the Change in Energy of the State. In computing
transition amplitudes we have considered only those states $m \neq n$.
Suppose we turn our attention to the term $m = n$. Considering the zero-
and first-order terms in the perturbation expansion, we have

$$e^{+(i/\hbar)E_nT}\lambda_{nn} = 1 - \frac{i}{\hbar}\int_0^T V_{nn}(t)\,dt \tag{6.112}$$

If $V$ is constant in time, this gives $1 - (i/\hbar)V_{nn}T$. What is the meaning
of this result? As a consequence of the introduction of the additional
potential $V$ into the original hamiltonian we can expect the energies of
all the states of the system to be slightly altered. We can write the new
energy of the state $n$ as $E_n + \Delta E_n$. The time-dependent portion of the
wave function describing this state will be $e^{-(i/\hbar)(E_n + \Delta E_n)t}$ instead of
the previous $e^{-(i/\hbar)E_nt}$.

Over the period of time $T$ during which the perturbing potential
acts this relative difference in phase introduces the factor $e^{-(i/\hbar)\Delta E_nT}$.
Expanding this factor to first order in time gives $1 - (i/\hbar)\Delta E_nT$. Thus
we see that a first-order calculation of the energy shift in a state $n$ due
to a perturbation $V$ is

$$\Delta E_n = V_{nn} \tag{6.113}$$


This derivation of the first-order energy shift is not correct if the 
system is degenerate, i.e., if there are initially several states of exactly 
the same energy. It turns out that in such a case terms of second order 
in V give equally large effects. 

Adding in the second-order term in the perturbation expansion for
the transition element gives

$$e^{+(i/\hbar)E_nT}\lambda_{nn} = 1 - \frac{i}{\hbar}V_{nn}T + \left(-\frac{i}{\hbar}\right)^2\sum_j V_{nj}V_{jn}\int_0^T\!\!\int_0^{t_c} e^{-(i/\hbar)(E_j - E_n)(t_c - t_d)}\,dt_d\,dt_c \tag{6.114}$$

For the present let us assume that there is no degeneracy. Consider
first the term $j = n$ in the series which is the second-order term.
The integral over this particular term is just $T^2/2$. Integrals for the
terms $j \neq n$ can also be performed easily to give the result

$$e^{+(i/\hbar)E_nT}\lambda_{nn} = 1 - \frac{i}{\hbar}V_{nn}T - \frac{1}{2\hbar^2}V_{nn}^2T^2 + \frac{i}{\hbar}\sum_{j \neq n}\frac{V_{nj}V_{jn}T}{E_j - E_n}\left[1 - \frac{1 - \exp\{-(i/\hbar)(E_j - E_n)T\}}{(i/\hbar)(E_j - E_n)T}\right] \tag{6.115}$$

The first three terms on the right-hand side of this equation represent
an expansion through second order of $e^{-(i/\hbar)V_{nn}T}$. The first of the
summation terms, the one corresponding to the 1 in brackets, can be
interpreted as a second-order energy change. That is, the incremental
energy is not just $V_{nn}$, but contains higher-order corrections. Writing
out the energy correction through second order in the perturbation energy,
we get

$$\Delta E_n = V_{nn} - \sum_{j \neq n}\frac{V_{nj}V_{jn}}{E_j - E_n} \tag{6.116}$$

This last equation gives the correct expression, through second order, 
for the shift in energy of nondegenerate states. This result is much more 
easily obtained by conventional methods, i.e., by finding solutions of 


$$(H + V)\phi = E\phi \tag{6.117}$$


Furthermore, the conventional approach based on Eq. (6.117) permits 
simpler handling of degenerate states. However, it has been our purpose 
here to give an example of the use of transition amplitudes, rather than 
to give the simplest formulas for the computation of energy shifts. 
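
Equation (6.116) can be compared with exact diagonalization for a small matrix model. The sketch below assumes a four-level system with made-up unperturbed energies `E0` and a small real symmetric perturbation `V`; these numbers and the helper name `second_order_shift` are illustrative choices, not values from the text.

```python
import numpy as np


def second_order_shift(E0, V, n):
    """Eq. (6.116): V_nn - sum over j != n of V_nj V_jn / (E_j - E_n), nondegenerate levels."""
    shift = V[n, n].real
    for j in range(len(E0)):
        if j != n:
            shift -= (V[n, j] * V[j, n]).real / (E0[j] - E0[n])
    return shift


# Illustrative nondegenerate spectrum and a weak symmetric perturbation (assumed values).
E0 = np.array([0.0, 1.0, 2.5, 4.0])
V = 0.01 * np.array([[1.0,  2.0, 0.5,  1.0],
                     [2.0, -1.0, 1.5,  0.3],
                     [0.5,  1.5, 0.7,  2.0],
                     [1.0,  0.3, 2.0, -0.4]])

if __name__ == "__main__":
    # eigvalsh returns ascending eigenvalues; for weak V they keep the ordering of E0.
    exact = np.linalg.eigvalsh(np.diag(E0) + V)
    for n in range(len(E0)):
        print(n, exact[n] - E0[n], V[n, n], second_order_shift(E0, V, n))
```

The second-order value should track the exact shift more closely than the first-order value $V_{nn}$ alone, with a remaining discrepancy of third order in $V$.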

Actually, there are more complex problems involving energy shifts
in which the method of transition amplitudes is the simplest to apply.
In such applications the scheme, as we have attempted to show above,
is to identify terms in a series proportional to $T$, $T^2$, etc. Then, if we
remember that the amplitude to stay in the initial state is proportional
to $e^{-(i/\hbar)\Delta E_nT}$ and that the series expansion is equivalent to a series
expansion of this exponential, the correct expression for $\Delta E_n$ can be
written down.

We have not yet discussed the last term in Eq. (6.115). If the states
$E_j$ lie in a continuum, we must also define the character of the reciprocal
in the sum of Eq. (6.116). If we take it to mean the principal value, just
as we found when analyzing the problem in second order for $m \neq n$, this
extra term can be shown to produce an effect proportional to $T$ and to
lead to an additional correction to Eq. (6.116) of

$$\Delta' E_n = -i\pi\sum_j V_{nj}V_{jn}\,\delta(E_j - E_n) \tag{6.118}$$

But this cannot represent a further correction to the energy for it is
purely imaginary, and the energy must be real. Let us call it $-i\hbar\gamma/2$
(the $\hbar/2$ is for convenience later) and write

$$\Delta E_n - \frac{i\hbar\gamma}{2} = V_{nn} - \sum_j\frac{V_{nj}V_{jn}}{E_j - E_n - i\epsilon} \tag{6.119}$$

This implies that the transition amplitude $\lambda_{nn}$ to be in the $n$th state
after a long time is proportional to

$$\exp\left[-\frac{i}{\hbar}\left(\Delta E_n - \frac{i\hbar\gamma}{2}\right)T\right] = \exp\left\{-\frac{i}{\hbar}\Delta E_n T\right\}\exp\left\{-\frac{\gamma T}{2}\right\}$$

The first factor is the energy shift. The second is easily interpreted; for
the probability to be in state $n$ after time $T$ is $|\lambda_{nn}|^2 = e^{-\gamma T}$. It falls
with time because at each instant there is a probability that a transition
is made from $n$ to some other state. That is, if all is consistent, $\gamma$ must
be the total probability per second of a transition from $n$ to any state in
the continuum of the same energy. This it is, because from Eq. (6.118)
our $\gamma$ is

$$\gamma = \frac{2\pi}{\hbar}\sum_j |V_{jn}|^2\,\delta(E_j - E_n) \tag{6.120}$$


So we see that the total probability per second is just the sum of 
Eq. (6.87) over all possible final states as required (i.e., up to the re- 
quired order in V). 

The reciprocal of $\gamma$ is called the mean lifetime of the state. Strictly
speaking, a state with a finite lifetime has no definite energy; the energy
uncertainty by the Heisenberg relation is $\hbar$/lifetime, or $\hbar\gamma$.

If resonance experiments are performed to find the energy difference
between two levels, each of which has a decay rate $\gamma$, the resonance is not
sharp but has a definite shape. The center of the resonance determines
the energy difference, and the width of the resonance gives the sum of
the $\gamma$’s of each level.
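
The exponential decay $|\lambda_{nn}|^2 = e^{-\gamma T}$ with $\gamma$ from Eq. (6.120) can be illustrated by coupling one level to a dense band of equally spaced levels and diagonalizing exactly. This is a sketch under assumptions made here: $\hbar = 1$, a constant coupling `V` to every band level, a band symmetric about $E_n = 0$ (so the principal-value energy shift vanishes), and a band wide compared with $\hbar\gamma$. Agreement is expected only approximately, and only for times well below the recurrence time $2\pi\hbar/\text{spacing}$.

```python
import numpy as np

HBAR = 1.0  # illustrative units with hbar = 1 (an assumption)


def survival_probability(V, spacing, half_width, T, hbar=HBAR):
    """|lambda_nn|^2 for one level at energy 0 coupled equally to a band of levels."""
    band = np.arange(-half_width, half_width, spacing) + 0.5 * spacing
    N = band.size
    H = np.zeros((N + 1, N + 1))
    H[0, 1:] = V                      # coupling V_nj between the level and the band
    H[1:, 0] = V
    idx = np.arange(1, N + 1)
    H[idx, idx] = band                # unperturbed band energies E_j
    energies, vecs = np.linalg.eigh(H)
    weights = vecs[0, :] ** 2         # overlap of the initial state with each exact eigenstate
    amplitude = np.sum(weights * np.exp(-1j * energies * T / hbar))
    return abs(amplitude) ** 2


def coupling_for_rate(gamma, spacing, hbar=HBAR):
    """Choose V so that Eq. (6.120), with density 1/spacing, gives the rate gamma."""
    return np.sqrt(gamma * hbar * spacing / (2.0 * np.pi))


if __name__ == "__main__":
    gamma, spacing, half_width = 0.25, 0.02, 10.0
    V = coupling_for_rate(gamma, spacing)
    for T in (1.0, 2.0, 4.0, 8.0):
        print(T, survival_probability(V, spacing, half_width, T), np.exp(-gamma * T))
```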


Transition Elements 


In the preceding chapter we developed the concept of a perturbation 
treatment for changes of state in a quantum-mechanical system. We car- 
ried out an investigation of this method as it is applied to systems whose 
unperturbed hamiltonians are constant in time. In this chapter we shall 
continue the development of the perturbation concept and generalize 
the treatment to cover systems where the unperturbed state may have a 
hamiltonian varying with time. We shall introduce a more general type 
of notation and attempt to broaden and deepen our understanding of 
the ways in which changes of state take place in a quantum-mechanical 
system. The notation to be introduced applies to a type of function 
which will be defined in the first portion of this chapter. The function 
is called a transition element. 

The chapter is divided into four parts. The first part, consisting of 
Sec. 7-1, defines “transition amplitude” and “transition element,” with 
the help of examples based upon the perturbation theory of Chap. 6. The 
second part, consisting of Secs. 7-2 to 7-4, gives some interesting general 
relations among transition elements. The third part, consisting of Sec. 
7-5, shows the connection between transition elements defined with the 
help of path integrals and the treatment of quantum-mechanical transi- 
tions defined in terms of the more usual operator notation of quantum 
mechanics. In the last part, consisting of Secs. 7-6 and 7-7, the results 
learned in the preceding sections are applied to two interesting problems 
of quantum mechanics. 


DEFINITION OF THE TRANSITION ELEMENT 


The time development of a quantum-mechanical system can be pictured
as follows. At an initial time $t_a$ the state is described by the wave
function $\psi(x_a, t_a)$. At a later time $t_b$ the original state will develop into
the state $\psi(x_b, t_b)$.

At this later time suppose we ask the question: What is the probability
of finding the system in the specific state $\chi(x_b, t_b)$? We know
from the general principles developed in Chap. 5 that the probability of
finding the system in this specified state is proportional to the absolute square of
the amplitude defined by

$$\int_{-\infty}^{\infty}\chi^*(x_b, t_b)\,\psi(x_b, t_b)\,dx_b$$

We also know from Chap. 3 that the function $\psi(x_b, t_b)$ can be expressed
in terms of the original wave function with the help of the kernel
$K(x_b, t_b; x_a, t_a)$ describing the propagation of the system between
the times $t_a$ and $t_b$. Thus, in determining the probability of finding the
system in a specified state we can start with the original wave function
$\psi(x_a, t_a)$ and bridge the time gap with the propagation kernel $K(b, a)$.

The resulting amplitude, whose absolute square gives the probability
desired, we shall call the transition amplitude, and we shall write it in
the following notation:

$$\langle\chi|1|\psi\rangle = \int\!\!\int \chi^*(x_b, t_b)\,K(b, a)\,\psi(x_a, t_a)\,dx_a\,dx_b \tag{7.1}$$

We wish to return to an even more basic description of the transition
phenomena, and we reintroduce the action $S[x(t)]$ describing the
behavior of the system between the two time limits. Thus we write the
transition amplitude as

$$\langle\chi|1|\psi\rangle_S = \int\!\!\int\!\!\int \chi^*(x_b, t_b)\,e^{iS/\hbar}\,\psi(x_a, t_a)\,\mathcal{D}x(t)\,dx_a\,dx_b \tag{7.2}$$


Here we have made the notation a bit more explicit by attaching the
subscript $S$ to the transition amplitude to indicate the action for which
the integral was calculated. The path integral is to be taken over all
paths that go from $x_a$ to $x_b$ and the result of this path integral is multiplied
by the two wave functions, then integrated over the space variables
at the two limits.

Before proceeding further, we shall define the notation more completely
to cover a more general situation. We introduce the functional
$F[x(t)]$ without (for the present) describing its physical nature. With
this functional we define a transition element as

$$\langle\chi|F|\psi\rangle_S = \int\!\!\int\!\!\int \chi^*(x_b, t_b)\,F[x(t)]\,e^{iS/\hbar}\,\psi(x_a, t_a)\,\mathcal{D}x(t)\,dx_a\,dx_b \tag{7.3}$$

Here $F$ is any functional of $x(t)$ which does not involve $x(t)$ at the end
points $x_a$ or $x_b$ or beyond the end points. In the special case that $F = 1$,
the integral of Eq. (7.3) is a transition amplitude.

It is difficult to understand transition elements at an intuitive level.
One approach toward such understanding involves a classical analogy.
Picture a small particle moving with brownian motion. At some initial
time $t_a$ the particle is at $x_a$. We wish to determine the probability
that the particle arrives at the point $x_b$ at the time $t_b$. For quantum-mechanical
particles, we talk about starting from an initial state and
arriving at some final state. Thus, the point $x_a$ for the brownian particle
is analogous to the initial wave function $\psi(x_a)$ in Eq. (7.2), and the
point $x_b$ to $\chi(x_b)$. Furthermore, the solution of the quantum-mechanical
problem requires integration over the variables $x_a$ and $x_b$ of the initial
and final states, a step unnecessary in our classical problem.

We would solve the classical problem by considering all possible paths
for the particle’s motion. We would weight each path with the function
defining the probability that the particle actually follows such a path
and then integrate the weighted contributions for all such paths. The
weighting function is analogous to the term $e^{iS/\hbar}$ appearing in the integral
of Eq. (7.2).

The final position in such a problem would not be a single point,
but rather a small interval, $x_b$ to $x_b + dx_b$. The result, when properly
normalized, would be the distribution function $P(x_b)$ giving the relative
probability of arriving in the (differential) vicinity of $x_b$. This function
is analogous to the transition amplitude of Eq. (7.2) in the case that
$\psi(x_a)$ and $\chi(x_b)$ are Dirac delta functions of position.

Now suppose we wish to know more about the motion than simply
the relative probability to arrive at $x_b$. For example, we may wish to find
the acceleration experienced by the particle at some particular instant,
say 2 seconds after it starts. But now we need the weighted average of the
acceleration, i.e., the acceleration for each possible path with each path
weighted by the function defining the probability of the path. Such a
weighted average is analogous to the transition element of Eq. (7.3). The
property of interest, such as the acceleration at some time $t_c$, replaces
the functional $F[x(t)]$ in the integral of Eq. (7.3). The classical problem
could be solved by a path integral very similar in form to Eq. (7.3).

In the remainder of this chapter we shall make use of this analogy, 
and we shall occasionally refer to transition elements as “weighted av- 
erages.” However, it must be kept in mind that the weighting function 
in quantum mechanics is a complex function. Thus the result is not an 
“average” in the ordinary sense. 

The path integral method of solving brownian-motion problems as 
described in this classical analogy is actually a very powerful method. 
It will be developed in detail in Sec. 12-6. For now, we attempt to 
further clarify the notion of a transition element with the help of the 
perturbation theory developed in Chap. 6. 


Perturbations. Suppose the action describing the development of
the system can be separated into two parts, so that $S = S_0 + \sigma$. We
suppose that the first part $S_0$ leads to simple path integrals, whereas
the remaining part $\sigma$ is small enough that we can apply a perturbation
scheme. We write the exponential function of Eq. (7.2) as

$$e^{iS/\hbar} = e^{iS_0/\hbar}\,e^{i\sigma/\hbar} \tag{7.4}$$

Using Eq. (7.3), the transition element of Eq. (7.2) becomes

$$\langle\chi|1|\psi\rangle_{S_0 + \sigma} = \langle\chi|e^{i\sigma/\hbar}|\psi\rangle_{S_0} \tag{7.5}$$

The exponential function can be expanded to give

$$\langle\chi|1|\psi\rangle_{S_0 + \sigma} = \langle\chi|1|\psi\rangle_{S_0} + \frac{i}{\hbar}\langle\chi|\sigma|\psi\rangle_{S_0} - \frac{1}{2\hbar^2}\langle\chi|\sigma^2|\psi\rangle_{S_0} + \cdots \tag{7.6}$$

This expansion is a generalized version of Eq. (6.3) and forms the
basis of the perturbation theory. The transition elements which arise in
most quantum-mechanical problems result from this expansion.

Suppose the perturbation action $\sigma$ results from a perturbation potential,
so that

$$\sigma = -\int_{t_a}^{t_b} V[x(t), t]\,dt \tag{7.7}$$

Then the first-order perturbation is given by the transition element

$$\frac{i}{\hbar}\langle\chi|\sigma|\psi\rangle_{S_0} = -\frac{i}{\hbar}\int_{t_a}^{t_b}\langle\chi|V[x(t), t]|\psi\rangle_{S_0}\,dt \tag{7.8}$$

To evaluate this element, we need to solve the integral

$$\langle\chi|V[x(t), t]|\psi\rangle_{S_0} = \int\!\!\int\!\!\int \chi^*(x_b)\,V[x(t), t]\,e^{iS_0/\hbar}\,\psi(x_a)\,\mathcal{D}x(t)\,dx_a\,dx_b \tag{7.9}$$


The first step in the solution of this integral is the same as the solution
for the perturbation kernel $K^{(1)}$ described in Eqs. (6.8) to (6.11).
This solution for the path integral is followed by integration over both
end points, $x_a$ and $x_b$, as well as an integral over the midpoint $x_c$. That
is,

$$\langle\chi|V[x(t), t]|\psi\rangle_{S_0} = \int\!\!\int\!\!\int \chi^*(x_b)\,K_0(b, c)\,V(c)\,K_0(c, a)\,\psi(x_a)\,dx_a\,dx_c\,dx_b \tag{7.10}$$

We have now arrived at an expression which combines three concepts
previously introduced. First, we have made use of the propagation rule
for a wave function as defined in Eq. (3.42). Next, we have made use of
the amplitude function as defined in Eq. (5.31), which gives the amplitude
that a system known to be in one state will be found in another
state. Lastly, we have made use of the first-order perturbation theory
given in Eq. (6.11) for the kernel describing the propagation in time. All
of these ideas combined give the transition element of Eq. (7.10). The
absolute square of this element is the probability that a system starting
in state $\psi$ and acted upon by the small potential $V(x,t)$ will be found
at a later time in state $\chi$ (if state $\chi$ would not be reached for $V = 0$,
that is, if $\langle\chi|1|\psi\rangle_{S_0} = 0$).
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We can use Eq. (3.42) to shorten our notation, in the same way that 
the notation of Eq. (6.23) was shortened into the form of Eq. (6.25). We 
define 


(UA ere =f Kol (Ca) (£a) dia (7.11) 


which is the wave function that would result at time $t_c$ from the initial
wave function if there were no perturbation. In a similar way we define

$$\chi^*(x_c, t_c) = \int_{-\infty}^{\infty} \chi^*(x_b)\,K_0(b,c)\,dx_b \tag{7.12}$$

as the complex conjugate of the wave function which, at $t = t_c$, would
result in the function $\chi(x_b)$ at time $t_b$ if there were no perturbation.
(See Eq. (4.38) and the following discussion, including Prob. 4-7.) 

In terms of these new wave functions, the first-order term in the 
perturbation expansion can be simplified to read 


$$\left\langle\chi\left|\int V[x(t),t]\,dt\right|\psi\right\rangle_{S_0} = \int\!\!\int \chi^*(c)\,V(c)\,\psi(c)\,dx_c\,dt_c \tag{7.13}$$

We see here that the transition amplitude written in this form is a
generalization of the transition amplitude $\lambda_{mn}$, which was introduced in
Sec. 6-5. If the wave functions on the right-hand side of Eq. (7.13) are
eigenfunctions, then the resulting transition amplitude is identical with
$\lambda_{mn}$, as defined by Eq. (6.70).

Thus the evaluation of a transition element of a functional $F[x(t)]$,
which depends only on $x$ at a particular time $t$ (that is, an ordinary
function of $x(t)$), or of a time integral of such a functional presents no
problem. The evaluation of a transition element for functionals involving
the values of $x$ at two separate times is also easy. This occurs, for
example, in the second-order perturbation term. This can be written as

$$-\frac{1}{2\hbar^2}\langle\chi|\sigma^2|\psi\rangle_{S_0} = -\frac{1}{2\hbar^2}\int\!\!\int \langle\chi|V[x(t),t]\,V[x(s),s]|\psi\rangle_{S_0}\,dt\,ds \tag{7.14}$$


The integrand of this last equation is itself a transition element, and 
it is written as 


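$$\langle\chi|V[x(t),t]\,V[x(s),s]|\psi\rangle_{S_0} = \int\!\!\int \chi^*(c)\,V(c)\,K_0(c,d)\,V(d)\,\psi(d)\,dx_c\,dx_d \tag{7.15}$$

where we have substituted $t_d = s$ and $t_c = t$ if $s < t$, or $t_d = t$ and $t_c = s$
if $t < s$.

Thus the second-order term in the perturbation expansion becomes

$$-\frac{1}{2\hbar^2}\left\langle\chi\left|\int_{t_a}^{t_b}\!\!\int_{t_a}^{t_b} V[x(t),t]\,V[x(s),s]\,dt\,ds\right|\psi\right\rangle_{S_0} = -\frac{1}{\hbar^2}\iiiint \chi^*(c)\,V(c)\,K_0(c,d)\,V(d)\,\psi(d)\,dx_c\,dt_c\,dx_d\,dt_d \tag{7.16}$$

On the right each pair of times is counted only once, since $K_0(c,d)$
vanishes for $t_c < t_d$; this accounts for the factor 2 relative to the
left-hand side.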


This can be recognized as a generalization of the transition amplitude 
defined in Eq. (6.74). Expressions involving three or more functionals 
are also readily written down. 
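
The structure of Eq. (7.6) can be seen in a toy setting. The following Python sketch is not from the text: it replaces the path integral by an ordinary gaussian-weighted average over a single variable x and takes the hypothetical perturbation σ/ħ = λx⁴ (the names `avg`, `lam`, and `sigma_over_hbar`, and the choice of weight, are assumptions made only for illustration). It compares the exact average of exp(iσ/ħ) with the series truncated after the first-order and after the second-order term.

```python
import cmath
import math

def avg(g, lo=-10.0, hi=10.0, n=20000):
    """Trapezoid estimate of the normalized average of g(x) with weight exp(-x^2/2)."""
    h = (hi - lo) / n
    total = 0.0 + 0.0j
    for k in range(n + 1):
        x = lo + k * h
        w = 0.5 if k in (0, n) else 1.0
        total += w * g(x) * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2.0 * math.pi)

lam = 0.002  # strength of the toy perturbation (illustrative value)

def sigma_over_hbar(x):
    return lam * x ** 4

# "Exact" average of exp(i sigma / hbar), the analogue of the left side of Eq. (7.6).
exact = avg(lambda x: cmath.exp(1j * sigma_over_hbar(x)))

# Successive truncations of the series on the right side of Eq. (7.6).
zeroth = avg(lambda x: 1.0)
first = zeroth + 1j * avg(sigma_over_hbar)
second = first - 0.5 * avg(lambda x: sigma_over_hbar(x) ** 2)

err_first = abs(exact - first)
err_second = abs(exact - second)
print(err_first, err_second)
```

For a small perturbation the second-order truncation should lie noticeably closer to the exact value than the first-order one, which is the sense in which the higher terms of Eq. (7.6) are corrections.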

Equation (7.6) corresponds also to a more general type of perturba- 
tion theory. For example, consider the case of the particle interacting 
with an oscillator. After integrals have been carried out over the coor- 
dinates describing the oscillator, the resulting action can be written as 
$S_0 + \sigma$, where (see Sec. 3-10)

$$\sigma = -\frac{1}{M\omega\sin\omega T}\int_{t_a}^{t_b}\!\!\int_{t_a}^{t} g[x(t),t]\,g[x(s),s]\,\sin\omega(t_b - t)\,\sin\omega(s - t_a)\,ds\,dt \tag{7.17}$$

with $g[x(t),t]$ characterizing the interaction of the particle and oscillator,
and $T = t_b - t_a$.

We have noted that path integrals involving such complicated actions
are very hard to evaluate indeed; but if the effect of the complicated term
$\sigma$ is expected to be small, we can obtain useful results with less effort
with the help of the perturbation expansion of Eq. (7.6). To illustrate,
we find the first-order term in such an expansion (i.e., the first Born
approximation). Using Eq. (7.17) for $\sigma$, we must evaluate the term
$(i/\hbar)\langle\chi|\sigma|\psi\rangle_{S_0}$. This term can be written as

$$\frac{i}{\hbar}\langle\chi|\sigma|\psi\rangle_{S_0} = \frac{-i}{\hbar M\omega\sin\omega T}\int_{t_a}^{t_b}\!\!\int_{t_a}^{t} \langle\chi|g[x(t),t]\,g[x(s),s]|\psi\rangle_{S_0}\,\sin\omega(t_b - t)\,\sin\omega(s - t_a)\,ds\,dt \tag{7.18}$$


so that the difficult part of the problem is reduced to finding

$$\langle\chi|g[x(t),t]\,g[x(s),s]|\psi\rangle_{S_0}$$

But this we have already done in Eq. (7.15), except that $g$ replaces
$V$. Therefore, we write

$$\langle\chi|g[x(t),t]\,g[x(s),s]|\psi\rangle_{S_0} = \int\!\!\int \chi^*(c)\,g(c)\,K_0(c,d)\,g(d)\,\psi(d)\,dx_c\,dx_d \tag{7.19}$$

This expression can be substituted into Eq. (7.18) to obtain the final
result for the first Born approximation, $(i/\hbar)\langle\chi|\sigma|\psi\rangle_{S_0}$.

Transition elements will come up more frequently in succeeding chap- 
ters. In each example they can be evaluated in the straightforward man- 
ner which we have illustrated here. For that reason, very little of the 
material in the remainder of this chapter is really essential to the work 
that follows. Nevertheless, there are two reasons for the inclusion of 
this material in this book. First, it is possible to obtain a very general 
relation between transition elements. This relation might well serve as 
an alternative starting point for the foundations of quantum mechanics. 
Second, for many people already familiar with the more conventional op- 
erator notation of quantum mechanics, it is helpful to have examples of 
the translation from the more customary representation into that which 
is used in this book, such as expressions in the form of Eq. (7.3). 

With the rules for translation available, the subject matter of the 
later chapters, developed as it is from the path integral approach, can 
be appreciated in terms of more familiar symbolic concepts. 

The relations discussed in the remainder of this chapter are indepen- 
dent of the form of the wave functions which describe either the initial 
or final state of the system, and which are used in defining the transition 
element. For this reason we shall abbreviate our notation by omitting 
any specific reference to these wave functions. Thus a transition element 
will be written as $\langle F\rangle_S$ instead of $\langle\chi|F|\psi\rangle_S$.


7-2 FUNCTIONAL DERIVATIVES


We are embarking on a mathematical development which leads to an 
interesting relation between transition elements. This relation finds its 
most elegant expression in terms of a mathematical idea, the functional 
derivative. Since this idea may not be familiar, we describe it in this 
section. 

The functional $F[x(t)]$ gives a number for each function $x(t)$ that
we may choose. We may ask: How much does this number change if
we make a very small change in the argument function $x(t)$? Thus, for
small $\eta(t)$, how much is $F[x(t) + \eta(t)] - F[x(t)]$? The effect to first
order in $\eta$ (assuming it exists, etc.) is some linear expression in $\eta$, say,
$\int K(s)\eta(s)\,ds$. Then $K(s)$ is called the functional derivative of $F[x(t)]$
with respect to variation of the function $x(t)$ at $s$. It is written $\delta F/\delta x(s)$.
That is, to first order,

$$F[x + \eta] = F[x] + \int \frac{\delta F}{\delta x(s)}\,\eta(s)\,ds + \cdots \tag{7.20}$$

This $\delta F/\delta x(s)$ depends on the function $x(t)$, of course, and also on the
value of $s$. Thus it is a functional of $x(t)$ and a function of time $s$.

We may look at it another way. Suppose time is divided into very
many steps of small interval $\epsilon$, the values of the time being $t_i$
($t_{i+1} = t_i + \epsilon$). The function $x(t)$ can now be specified approximately by giving
the value $x_i$ that it takes on at each of the times $t_i$. The functional
$F[x(t)]$ now is a number depending on all the $x_i$; that is, it becomes an
ordinary function of the variables $x_i$,

$$F[x(t)] \to F(\ldots, x_i, x_{i+1}, \ldots) \tag{7.21}$$

Now we can consider $\partial F/\partial x_i$, the derivative of $F$ with respect to just one
of these several variables. Our functional derivative is just this partial
derivative, taken at the point $t_i = s$, and then divided by $\epsilon$. That is,

$$\frac{\delta F}{\delta x(s)} \to \frac{1}{\epsilon}\frac{\partial F}{\partial x_i} \tag{7.22}$$

This we can see as follows. If we alter the path from $x(t)$ to $x(t) + \eta(t)$,
we change all the $x_i$ from $x_i$ to $x_i + \eta_i$ (where $\eta_i = \eta(t_i)$), so that the
first-order change in our function is

$$F(\ldots, x_i + \eta_i, x_{i+1} + \eta_{i+1}, \ldots) - F(\ldots, x_i, x_{i+1}, \ldots) = \sum_i \frac{\partial F}{\partial x_i}\,\eta_i \tag{7.23}$$

from the ordinary rules of partial differentiation. If now we call
$(1/\epsilon)(\partial F/\partial x_i) = K_i$, the last sum is $\sum_i K_i\eta_i\epsilon$, which in the limit
becomes $\int K(s)\eta(s)\,ds$. So if this limit exists as $\epsilon \to 0$, then it is equal to
$\delta F/\delta x(s)$.
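
The time-sliced definition of Eq. (7.22) lends itself to a quick numerical trial. The Python sketch below is an illustration only: the functional F[x] = ∫x(t)² dt, the sample path x(t) = sin t, and the helper names are all assumptions made here, not taken from the text. For this F the functional derivative is 2x(s), and the code compares that with (1/ε)∂F/∂x_i computed by a central difference.

```python
import math

def functional(xs, eps):
    """Discretized F[x] = integral of x(t)^2 dt, an ordinary function of the x_i."""
    return sum(x * x for x in xs) * eps

def functional_derivative(F, xs, eps, i, h=1e-6):
    """(1/eps) * dF/dx_i as in Eq. (7.22); the partial derivative is a central difference."""
    up = list(xs)
    up[i] += h
    dn = list(xs)
    dn[i] -= h
    return (F(up, eps) - F(dn, eps)) / (2.0 * h) / eps

eps = 0.01
ts = [k * eps for k in range(101)]
xs = [math.sin(t) for t in ts]      # sample path x(t) = sin t on 0 <= t <= 1

i = 40                              # evaluate at s = ts[i]
numeric = functional_derivative(functional, xs, eps, i)
exact = 2.0 * xs[i]                 # delta F / delta x(s) = 2 x(s) for this F
print(numeric, exact)
```

The division by ε is what keeps the result finite as the slicing is refined: the partial derivative ∂F/∂x_i alone is of order ε.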


One can also use the ideas of differentials. Just as we can write

$$df = \sum_i \frac{\partial f}{\partial x_i}\,dx_i$$

so we can write for the first variation of any functional

$$\delta F = \int \frac{\delta F}{\delta x(s)}\,\delta x(s)\,ds \tag{7.24}$$

where $\delta x(s)$ is the differential change in path at $s$.


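Problem 7-1 If $S[x(t)] = \int_{t_a}^{t_b} L(\dot{x}(t), x(t), t)\,dt$, show that, for any $s$
inside the range $t_a$ to $t_b$,

$$\frac{\delta S}{\delta x(s)} = -\frac{d}{ds}\left(\frac{\partial L}{\partial \dot{x}}\right) + \frac{\partial L}{\partial x} \tag{7.25}$$

where the partial derivatives are evaluated at $t = s$.

Problem 7-2 If $F[x(t)] = x(t)$, show that

$$\frac{\delta F}{\delta x(s)} = \delta(t - s) \tag{7.26}$$

Problem 7-3 If

$$F[j(\mathbf{r},t)] = \exp\left\{\frac{1}{2}\iiiint j(\mathbf{r}_1,t_1)\,j(\mathbf{r}_2,t_2)\,R(\mathbf{r}_2 - \mathbf{r}_1, t_2 - t_1)\,d^3r_2\,dt_2\,d^3r_1\,dt_1\right\}$$

where the integrals extend over all space and all time, show that

$$\frac{\delta F}{\delta j(\mathbf{x},s)} = F\int\!\!\int j(\mathbf{r},t)\,\tfrac{1}{2}\left[R(\mathbf{r} - \mathbf{x}, t - s) + R(\mathbf{x} - \mathbf{r}, s - t)\right]d^3r\,dt \tag{7.27}$$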


Note that the function $j(\mathbf{r},t)$ is a function of the four variables
$(r_x, r_y, r_z, t)$. Thus the single coordinate $s$, as used in Eq. (7.24), for
example, must be replaced by the set of coordinates $(x, y, z, s)$ in
specifying the point at which the functional derivative is evaluated.




The general relation between functionals which we mentioned at the 
end of the preceding section may be obtained by trying to develop a 
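formula for the transition element of $\delta F/\delta x(s)$. This we can do most
easily in this way. Consider, using an abbreviated notation,

$$\langle F\rangle_S = \int F[x(t)]\,e^{(i/\hbar)S[x(t)]}\,\mathcal{D}x(t) \tag{7.28}$$

Now in the integral over paths substitute $x(t) + \eta(t)$ for the variable
$x(t)$. For fixed $\eta(t)$, $\mathcal{D}[x(t) + \eta(t)] = \mathcal{D}x(t)$, because $d[x_i + \eta_i] = dx_i$.
But the integral is unchanged by a substitution of its variable. Hence

$$\begin{aligned}
\langle F\rangle_S &= \int F[x(t) + \eta(t)]\,e^{(i/\hbar)S[x(t)+\eta(t)]}\,\mathcal{D}x(t) \\
&= \int F[x(t)]\,e^{(i/\hbar)S[x(t)]}\,\mathcal{D}x(t) + \int\!\!\int \frac{\delta F}{\delta x(s)}\,\eta(s)\,ds\;e^{(i/\hbar)S[x(t)]}\,\mathcal{D}x(t) \\
&\quad + \frac{i}{\hbar}\int\!\!\int F[x(t)]\,\frac{\delta S}{\delta x(s)}\,\eta(s)\,ds\;e^{(i/\hbar)S[x(t)]}\,\mathcal{D}x(t) + \cdots
\end{aligned} \tag{7.29}$$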


expanding the exponential and displaying terms only to first order. The
zero-order term is exactly $\langle F\rangle_S$ again, so the remaining terms must all vanish.
In particular, the first-order term must vanish for any $\eta(s)$, so that we
conclude the relation

$$\left\langle\frac{\delta F}{\delta x(s)}\right\rangle_S = -\frac{i}{\hbar}\left\langle F\,\frac{\delta S}{\delta x(s)}\right\rangle_S \tag{7.30}$$

This general relation has many important consequences.

It would be possible to use Eq. (7.30) as a starting point to define the
laws of quantum mechanics. One could work backwards to reproduce,
for example, Eq. (7.6). If some generalization of quantum mechanics
is desired, one might suppose such a generalization is included in the
action $S$ appearing in the term $e^{iS/\hbar}$, or perhaps start with a form like
Eq. (7.30) and introduce modifications with the help of the differential
notation. Julian Schwinger has been investigating the formulation of
quantum mechanics suggested by Eq. (7.30).

We can see how the relation of Eq. (7.30) comes about in another
way by imagining our time split into intervals $\epsilon$ and functionals replaced
by functions of the points $x_i$ corresponding to $t_i$. Then consider the
path integral

$$\int \frac{\partial F}{\partial x_k}\,e^{(i/\hbar)S[x(t)]}\,\mathcal{D}x(t) \tag{7.31}$$

where $t_k$ is some intermediate time not at either end point. The path
integral is simply an integral over all the points $x_i$. So we integrate by
parts to get

$$\int \frac{\partial F}{\partial x_k}\,e^{(i/\hbar)S[x(t)]}\,\mathcal{D}x(t) = -\frac{i}{\hbar}\int F\,\frac{\partial S}{\partial x_k}\,e^{(i/\hbar)S[x(t)]}\,\mathcal{D}x(t) \tag{7.32}$$


dropping the integrated part. 
Problem 7-4 Discuss why the integrated part vanishes. 


The result is 


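$$\left\langle\frac{\partial F}{\partial x_k}\right\rangle_S = -\frac{i}{\hbar}\left\langle F\,\frac{\partial S}{\partial x_k}\right\rangle_S \tag{7.33}$$

which has the same content as Eq. (7.30).
It is better to write these relations as differentials,

$$\langle\delta F\rangle_S = -\frac{i}{\hbar}\langle F\,\delta S\rangle_S \tag{7.34}$$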


for then the specific variables on which F and S depend need not be 
indicated 
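
The integration by parts behind Eq. (7.33) can be tried on an ordinary one-variable integral. The Python sketch below is an analogue rather than the relation itself: it assumes a real weight exp(-S(x)) in place of exp(iS/ħ), so that the factor -(i/ħ) is replaced by +1 and the integrals converge absolutely. The particular S(x), F(x), and helper names are hypothetical choices made for illustration.

```python
import math

def trapz(f, a, b, n):
    """Composite trapezoid rule for the integral of f from a to b."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

def S(x):            # toy "action" of a single variable
    return 0.5 * x * x + 0.25 * x ** 4

def dS(x):
    return x + x ** 3

def F(x):            # toy "functional" of the same variable
    return x ** 3 + x

def dF(x):
    return 3.0 * x * x + 1.0

# Analogue of <dF/dx_k> and of <F dS/dx_k> with the real weight exp(-S).
lhs = trapz(lambda x: dF(x) * math.exp(-S(x)), -6.0, 6.0, 4000)
rhs = trapz(lambda x: F(x) * dS(x) * math.exp(-S(x)), -6.0, 6.0, 4000)
print(lhs, rhs)
```

The two numbers agree because the boundary term of the integration by parts is negligible for this rapidly decaying weight.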


Problem 7-5 Argue that Eq. (7.34) may be misleading, for Eq. (7.33)
applies only to rectangular coordinates. Do this by studying the
corresponding relation where spherical coordinates, for example, are used and
we wish to find $\langle\partial F/\partial r_k\rangle_S$.


7-3 TRANSITION ELEMENTS OF SOME SPECIAL FUNCTIONALS


The relation of Eq. (7.34) has many interesting implications. In this 
section we shall investigate some of them. We shall take the special case 
of a one-dimensional particle moving in a potential V(x). 

Suppose the action over the path of the particle is given by 


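$$S = \int_{t_a}^{t_b}\left[\frac{m}{2}\dot{x}^2 - V(x(t))\right]dt \tag{7.35}$$

Upon application of the small variation $\delta x(t)$ to each path there results
(to first order)

$$\delta S = -\int_{t_a}^{t_b}\left[m\ddot{x} + V'(x)\right]\delta x(t)\,dt \tag{7.36}$$

Using Eq. (7.34), we have

$$\langle\delta F\rangle = \frac{i}{\hbar}\left\langle F\int_{t_a}^{t_b}\left[m\ddot{x} + V'(x)\right]\delta x(t)\,dt\right\rangle \tag{7.37}$$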

Alternatively, we could return to the point of view used in developing 
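Eq. (7.33). That is, we imagine time divided into small slices of length
$\epsilon$. In this case the action $S$ can be written as

$$S = \sum_i\left[\frac{m}{2}\frac{(x_{i+1} - x_i)^2}{\epsilon} - V(x_i)\,\epsilon\right] \tag{7.38}$$

If we select a particular time $t_k$ and, as before, let $x_k$ be the associated
position of a path, then

$$\frac{\partial S}{\partial x_k} = -m\left(\frac{x_{k+1} - x_k}{\epsilon} - \frac{x_k - x_{k-1}}{\epsilon}\right) - V'(x_k)\,\epsilon \tag{7.39}$$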
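
Equation (7.39) is an ordinary partial derivative of the sliced action of Eq. (7.38), so it can be checked directly. The following Python sketch is an illustration only; the quartic potential, the sample path, and the function names are assumptions introduced here. It differentiates the discretized action numerically with respect to one x_k and compares the result with the closed form.

```python
import math

def V(x):                      # illustrative potential, not from the text
    return 0.25 * x ** 4

def dV(x):
    return x ** 3

def action(xs, m, eps):
    """Sliced action of Eq. (7.38): sum of (m/2)(x_{i+1}-x_i)^2/eps - V(x_i) eps."""
    total = 0.0
    for i in range(len(xs) - 1):
        total += 0.5 * m * (xs[i + 1] - xs[i]) ** 2 / eps - V(xs[i]) * eps
    return total

def dS_formula(xs, m, eps, k):
    """Right-hand side of Eq. (7.39) for an interior point k."""
    return (-m * ((xs[k + 1] - xs[k]) / eps - (xs[k] - xs[k - 1]) / eps)
            - dV(xs[k]) * eps)

def dS_numeric(xs, m, eps, k, h=1e-5):
    """Central-difference estimate of dS/dx_k."""
    up = list(xs)
    up[k] += h
    dn = list(xs)
    dn[k] -= h
    return (action(up, m, eps) - action(dn, m, eps)) / (2.0 * h)

m, eps = 1.0, 0.05
xs = [math.cos(0.3 * i) + 0.1 * i for i in range(21)]   # arbitrary sample path
k = 7
formula = dS_formula(xs, m, eps, k)
numeric = dS_numeric(xs, m, eps, k)
print(formula, numeric)
```

Dividing the kinetic part by ε gives the second difference (x_{k+1} - 2x_k + x_{k-1})/ε², which is the discrete acceleration.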


Upon application of Eq. (7.33) there results 


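$$\left\langle\frac{\partial F}{\partial x_k}\right\rangle = \frac{i\epsilon}{\hbar}\left\langle F\left[m\,\frac{x_{k+1} - 2x_k + x_{k-1}}{\epsilon^2} + V'(x_k)\right]\right\rangle \tag{7.40}$$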


In this last expression the factor involving an $\epsilon^2$ in the denominator is
actually the acceleration $\ddot{x}$ evaluated at the time $t_k$. Thus Eq. (7.40)
is just a special example of Eq. (7.37). In particular, it corresponds to
Eq. (7.37) if $\delta x(t)$ is zero for all $t \ne t_k$. If $\delta x(t)$ is assigned the value
$\epsilon\,\delta x_k\,\delta(t - t_k)$, then Eq. (7.40) results. Actually Eq. (7.40), since it
holds for all $k$, is completely equivalent to Eq. (7.37) in a more detailed
notation.

In Eq. (7.37) suppose we choose the special functional $F = 1$. Then
$\delta F = 0$ and we have

$$\frac{i}{\hbar}\left\langle\int_{t_a}^{t_b}\left[m\ddot{x} + V'(x)\right]\delta x(t)\,dt\right\rangle = 0 \tag{7.41}$$

Since this result holds for any arbitrary choice of $\delta x(t)$, it must be that

$$\langle m\ddot{x}\rangle = -\langle V'(x)\rangle \tag{7.42}$$


at all values of time. This is the quantum-mechanical analogue of New- 
ton’s law. Making use of the classical analogue for a transition element, 
described in Sec. 7-1, this result says that the weighted “average” of the 
mass times acceleration at any time (where “averaged” means “averaged 
over all paths with the weight $e^{iS/\hbar}$”) is equal to the weighted “average”
of the force (negative gradient of the potential) at the same time. 

As another example, suppose $F$ is some arbitrary nonzero functional
of all position variables except $x_k$. Then the left-hand side of Eq. (7.40)
is zero (since $\partial F/\partial x_k = 0$) and there results

$$\left\langle F(\ldots, x_{k-1}, x_{k+1}, \ldots)\left[m\,\frac{x_{k+1} - 2x_k + x_{k-1}}{\epsilon^2} + V'(x_k)\right]\right\rangle = 0 \tag{7.43}$$

This equation says that the transition element of $m\ddot{x} + V'(x)$, averaged
over all paths, is zero at $t_k$ even if these paths are weighted with an
arbitrary functional, so long as the functional is independent of the
position of the path at the time $t_k$ of interest.

Suppose, however, the functional does depend upon the position of 
the path at the moment of interest. In particular, suppose simply that 
the functional $F$ is $x_k$. Applying Eq. (7.40), we have

$$\begin{aligned}
\langle 1\rangle &= \frac{i\epsilon}{\hbar}\left\langle m x_k\,\frac{x_{k+1} - 2x_k + x_{k-1}}{\epsilon^2} + x_k V'(x_k)\right\rangle \\
&= \frac{i}{\hbar}\left\langle m x_k\left(\frac{x_{k+1} - x_k}{\epsilon} - \frac{x_k - x_{k-1}}{\epsilon}\right) + \epsilon\,x_k V'(x_k)\right\rangle
\end{aligned} \tag{7.44}$$

If we suppose that the potential $V(x)$ is a smooth function, then in the
limit as $\epsilon \to 0$ we find that $\epsilon x_k V'(x_k)$ becomes negligible in comparison
with the remaining terms. The result is

$$\left\langle m\,\frac{x_{k+1} - x_k}{\epsilon}\,x_k\right\rangle - \left\langle x_k\,m\,\frac{x_k - x_{k-1}}{\epsilon}\right\rangle = \frac{\hbar}{i}\langle 1\rangle \tag{7.45}$$

This last equation involves the product of position variables $x$ and
momentum variables $m\dot{x}$. In the first term the momentum is evaluated
first as a linear average corresponding to the time $t_k + \epsilon/2$, and the
position is taken at $t_k$. In the second term the position is again taken
at $t_k$, but the momentum corresponds to the time $t_k - \epsilon/2$. Thus this
equation says that the transition element of a product of position and
momentum depends upon the order in time of these two quantities.

Later on, when we make a translation into the more usual operator 
notation, we shall see (Sec. 7-5) that both the operator equation of 
motion, corresponding to Eq. (7.42), and the operator commutation laws 
of Eq. (7.45) have been derived from the same fundamental relation, 
Eq. (7.34).

We can derive a further result from Eq. (7.45) which will give us a 
better idea of the characteristics of the paths which are important in 
quantum mechanics. Consider the two terms 


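$$\left\langle x_k\,m\,\frac{x_k - x_{k-1}}{\epsilon}\right\rangle \tag{7.46}$$

and

$$\left\langle x_{k+1}\,m\,\frac{x_{k+1} - x_k}{\epsilon}\right\rangle \tag{7.47}$$

These two terms differ from each other only in order $\epsilon$, since they are
the same quantity calculated at two times differing by the interval $\epsilon$.
Thus we are justified in substituting Eq. (7.47) for the second term in
Eq. (7.45). The result is

$$\left\langle m\,\frac{x_{k+1} - x_k}{\epsilon}\,(x_k - x_{k+1})\right\rangle = \frac{\hbar}{i}\langle 1\rangle \tag{7.48}$$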


Alternatively, we can write this as 


$$\left\langle\left(\frac{x_{k+1} - x_k}{\epsilon}\right)^2\right\rangle = -\frac{\hbar}{im\epsilon}\langle 1\rangle \tag{7.49}$$

This equation says that the transition element of the square of the
velocity is of the order $1/\epsilon$, and thus becomes infinite as $\epsilon$ approaches zero.
This result implies that the important paths for a quantum-mechanical
particle are not those which have a definite slope (or velocity)
everywhere, but are instead quite irregular on a very fine scale, as suggested
by the sketch of Fig. 7-1. In fact, these irregularities are such that
the “average” square velocity does not exist, where we have used the
classical analogue in referring to an “average.”

If some average velocity is defined for a short time interval $\Delta t$, as,
for example, $[x(t + \Delta t) - x(t)]/\Delta t$, the “mean” square value of this is
$-\hbar/(im\,\Delta t)$. That is, the “mean” square value of a velocity averaged
over a short time interval is finite, but its value becomes larger as the
interval becomes shorter. 
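
The 1/ε scaling of Eq. (7.49) can also be seen in one line from the gaussian factor that the kinetic term of Eq. (7.38) contributes for a single time step. The worked estimate below is a sketch and not part of the derivation above: it assumes that, to leading order in ε, the rest of the integrand varies slowly over the width of this gaussian, and it writes Δ = x_{k+1} - x_k, a symbol introduced here only for the illustration.

```latex
% Single-step estimate of Eq. (7.49); \Delta = x_{k+1} - x_k is illustrative notation.
% Uses \int \Delta^2 e^{-a\Delta^2} d\Delta / \int e^{-a\Delta^2} d\Delta = 1/(2a),
% continued to the imaginary value a = -im/(2\hbar\epsilon).
\begin{align*}
\frac{\int \Delta^{2}\, e^{im\Delta^{2}/2\hbar\epsilon}\, d\Delta}
     {\int e^{im\Delta^{2}/2\hbar\epsilon}\, d\Delta}
  &= \frac{1}{2a}\bigg|_{a=-im/2\hbar\epsilon}
   = \frac{i\hbar\epsilon}{m}, \\
\left\langle \left(\frac{x_{k+1}-x_k}{\epsilon}\right)^{2} \right\rangle
  &\approx \frac{1}{\epsilon^{2}}\,\frac{i\hbar\epsilon}{m}\,\langle 1\rangle
   = -\frac{\hbar}{im\epsilon}\,\langle 1\rangle .
\end{align*}
```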

It appears that quantum-mechanical paths are very irregular. How- 
ever, these irregularities average out over a reasonable length of time 
to produce a reasonable drift, or “average” velocity, although for short 
intervals of time the “average” value of the velocity is very high. 








Fig. 7-1 Typical paths of a quantum-mechanical particle are highly irregular
on a fine scale, as shown in the sketch. Thus, although a mean velocity can
be defined, no mean-square velocity exists at any point. In other words, the
paths are nondifferentiable.

Problem 7-6 Show, for a particle moving in three-dimensional
space $x$, $y$, $z$,

$$\langle(x_{k+1} - x_k)^2\rangle = \langle(y_{k+1} - y_k)^2\rangle = \langle(z_{k+1} - z_k)^2\rangle = -\frac{\hbar\epsilon}{im}\langle 1\rangle \tag{7.50}$$

$$\langle(x_{k+1} - x_k)(y_{k+1} - y_k)\rangle = \langle(x_{k+1} - x_k)(z_{k+1} - z_k)\rangle = \langle(y_{k+1} - y_k)(z_{k+1} - z_k)\rangle = 0 \tag{7.51}$$


It will not do to write the transition element of the kinetic energy 
simply as 


$$\left\langle\frac{m}{2}\left(\frac{x_{k+1} - x_k}{\epsilon}\right)^2\right\rangle \tag{7.52}$$


for this quantity becomes infinite as € approaches zero. How shall we 
find an appropriate expression to represent the kinetic energy? We might 
make the heuristic guess that only those functionals F which might ap- 
pear in some kind of a physical perturbation problem may be of impor- 
tance. How can we get the kinetic energy through a perturbation? If 
the mass of the particle were perturbed by a factor $1 + \eta$ (with $\eta$ very
small) for some short interval of time $\Delta t$, the action would be perturbed
by $\eta\,\Delta t\,(m/2)\dot{x}^2$, which is proportional to the kinetic energy. We are led
to ask: What would be the form of the first-order perturbation $\langle\sigma\rangle_{S_0}$ if
$m$ were changed to $m(1 + \eta)$ for a short time?

For simplicity we can take the short time to be just $\epsilon$, the step used
to define the time spacing, so that the first-order term divided by $\epsilon\eta$ is
the kinetic energy. The perturbation in $S$ of Eq. (7.38) (if the $m$ in the
$i = k$ term is changed to $m + \eta m$) is clearly $\epsilon\eta(m/2)(x_{k+1} - x_k)^2/\epsilon^2$.
But this is not the only change in the path integral if $m$ changes. The
normalizing factors $1/A$ for each $m$ vary as $m^{1/2}$, so a factor $(1 + \eta/2)$
is introduced from this. Hence the entire first-order change in the path
integral when $m$ is so changed becomes, after dividing by $\eta\epsilon$,

$$\frac{i}{\hbar}\left\langle\frac{m}{2}\left(\frac{x_{k+1} - x_k}{\epsilon}\right)^2 + \frac{\hbar}{2i\epsilon}\right\rangle \tag{7.53}$$

which should be satisfactory for $i/\hbar$ times the kinetic energy.

Using Eq. (7.49), one might expect this to vanish; but Eq. (7.49) is
valid only as $\epsilon \to 0$ to the order $1/\epsilon$. The quantity in Eq. (7.53) is, in
fact, finite as $\epsilon \to 0$. The expression can be rewritten by expanding the
quadratic term. In Eq. (7.40) let $F$ be $x_{k+1} - x_k$. If terms of lowest
order in $\epsilon$ are kept, the result is

$$\frac{m}{2}\left\langle\frac{x_{k+1} - x_k}{\epsilon}\,\frac{x_k - x_{k-1}}{\epsilon}\right\rangle = \frac{m}{2}\left\langle\left(\frac{x_{k+1} - x_k}{\epsilon}\right)^2\right\rangle + \frac{\hbar}{2i\epsilon}\langle 1\rangle \tag{7.54}$$


Thus we can define the left-hand side of Eq. (7.54) as the transition 
element of the kinetic energy. 
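
The step from Eq. (7.40) to Eq. (7.54) can be written out as follows. This is a sketch of the intermediate algebra; the shorthand $v_\pm$ for the two neighboring difference quotients is notation introduced here for the illustration only.

```latex
% Intermediate steps for Eq. (7.54); v_+ and v_- are illustrative shorthand.
% v_+ = (x_{k+1}-x_k)/\epsilon,  v_- = (x_k-x_{k-1})/\epsilon.
\begin{align*}
F = x_{k+1}-x_k \;&\Rightarrow\; \frac{\partial F}{\partial x_k} = -1, \\
-\langle 1\rangle
  &= \frac{i\epsilon}{\hbar}\left\langle \epsilon v_+\left[\frac{m\,(v_+ - v_-)}{\epsilon} + V'(x_k)\right]\right\rangle
   = \frac{i\epsilon m}{\hbar}\left\langle v_+^{2} - v_+ v_-\right\rangle
     + \frac{i\epsilon^{2}}{\hbar}\left\langle v_+ V'(x_k)\right\rangle, \\
\langle v_+ v_-\rangle
  &= \langle v_+^{2}\rangle + \frac{\hbar}{i\epsilon m}\langle 1\rangle
     \quad\text{(dropping the term of higher order in } \epsilon\text{)}, \\
\frac{m}{2}\langle v_+ v_-\rangle
  &= \frac{m}{2}\langle v_+^{2}\rangle + \frac{\hbar}{2i\epsilon}\langle 1\rangle .
\end{align*}
```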

We see from this result that the easiest way to produce
transition elements involving powers of the velocities is to replace these 
powers by a product of velocities, each factor of which is taken at a 
slightly different time. 

In simple problems the transition elements can sometimes be evalu- 
ated directly. For such problems the same results can also be obtained 
by using the relations among transition elements which we derived in 
Sec. 7-2. These relations may supply us with soluble differential equa- 
tions for the transition elements. We shall give a few illustrations, but 
it will be readily seen that the examples for which the method works 
must be so simple that a direct evaluation would not really be much 
more difficult. 

For our first example, consider the case of a free particle going from 
$x_a$ to $x_b$ in the total time interval $T$. Let us find the transition element of
the position at the time $t$, that is, $x(t)$. Of course, this is some function
of $t$ and it is clear that

$$\langle x(0)\rangle = x_a\langle 1\rangle \qquad \langle x(T)\rangle = x_b\langle 1\rangle \tag{7.55}$$


Since any potentials acting on the particle are constant in space (i.e., no 
forces act), the second derivative of the transition element of position is 
zero in accordance with Eq. (7.42). Thus an integration gives 


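$$\langle x(t)\rangle = \left[x_a + \frac{t}{T}(x_b - x_a)\right]\langle 1\rangle \tag{7.56}$$

Note that the expression in the brackets is just the value of $x(t)$ along
the classical path $\bar{x}(t)$.


Problem 7-7 Show that for any quadratic action

$$\langle x(t)\rangle = \bar{x}(t)\langle 1\rangle \tag{7.57}$$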


As a somewhat less trivial example, let us try to evaluate the
transition element $\langle x(t)x(s)\rangle$ for the same free-particle conditions. Since
this is a function of two times, we can write it as $g(t,s)$. The second
derivative with respect to $t$ is

$$\frac{\partial^2 g(t,s)}{\partial t^2} = \langle\ddot{x}(t)\,x(s)\rangle \tag{7.58}$$

This transition element can be worked out by substituting $F = x(s)$
into Eq. (7.40). For $s \ne t$, following the arguments leading to Eq. (7.42),
the result is $-(1/m)\langle V'(x(t))\,x(s)\rangle$; while for $s = t$, following the
arguments leading to Eq. (7.44), we find that the transition element of
Eq. (7.58) is of order $1/\epsilon$. In the limit of small $\epsilon$ we have

$$\frac{\partial^2 g(t,s)}{\partial t^2} = \langle\ddot{x}(t)\,x(s)\rangle = \frac{\hbar}{im}\,\delta(t - s)\langle 1\rangle - \frac{1}{m}\langle x(s)\,V'(x(t))\rangle \tag{7.59}$$


Since for our free particle the potential is independent of position, the 
second term on the right of Eq. (7.59) vanishes. The resulting equation 
may be solved by dividing the region of interest into two parts. For t < s 


g(t, s) = a(s)t + b(s) (7.60) 
while for t > s 
g(t, s) = A(s)t + B(s) (7.61) 


Thus the first derivative of the function g with respect to t jumps by 
the quantity A(s) — a(s) as t goes from just below to just above s, and 
in accordance with Eq. (7.59), $A(s) - a(s) = (\hbar/im)\langle 1\rangle$.

The boundary conditions state that

$$\langle x(0)x(s)\rangle = x_a\langle x(s)\rangle = x_a\bar{x}(s)\langle 1\rangle \qquad \langle x(T)x(s)\rangle = x_b\bar{x}(s)\langle 1\rangle \tag{7.62}$$

This is not enough information to determine all of the four functions
$a(s)$, $b(s)$, $A(s)$, and $B(s)$, but we can either make use of the relation

$$\frac{\partial^2 g}{\partial s^2} = \frac{\hbar}{im}\,\delta(s - t)\langle 1\rangle \tag{7.63}$$

obtained by differentiating $g(t,s)$ with respect to $s$, or else notice that
$g(t,s)$ must be symmetric in $t$ and $s$. One can conclude that the
functions $a(s)$, $b(s)$, $A(s)$, and $B(s)$ must all be linear in $s$. The boundary
conditions are now sufficient to determine the solution. The result is

$$\langle x(t)x(s)\rangle = \begin{cases}
\left[\bar{x}(t)\bar{x}(s) + \dfrac{\hbar}{im}\dfrac{t(s - T)}{T}\right]\langle 1\rangle & \text{for } t < s \\[2ex]
\left[\bar{x}(t)\bar{x}(s) + \dfrac{\hbar}{im}\dfrac{s(t - T)}{T}\right]\langle 1\rangle & \text{for } t > s
\end{cases} \tag{7.64}$$

That this result is right can be verified by inspection. The product
of two classical paths taken at different times, $\bar{x}(t)\bar{x}(s)$, is the solution
of the homogeneous equations obtained by setting the right-hand sides 
of Eqs. (7.59) and (7.63) equal to zero, which satisfies the necessary 
boundary conditions. The last terms on the right of the pair of equations 
(7.64) are the special solutions of the inhomogeneous equations (7.59) 
and (7.63), which are zero at the end points. 
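
The extra term in Eqs. (7.64) can be compared with the time-sliced free-particle action. The Python sketch below rests on assumptions that are not part of the text: it writes x = x̄ + y with y vanishing at both ends, so that the sliced action differs from its classical value by (m/2ε) yᵀMy with M the tridiagonal matrix (2, -1), and it uses the formal gaussian rule ⟨y_j y_k⟩/⟨1⟩ = (iħε/m)(M⁻¹)_{jk}, continued to an imaginary exponent. The helper names and the values of T and N are illustrative.

```python
def solve_tridiagonal(n, diag, off, rhs):
    """Thomas algorithm for a symmetric tridiagonal system with constant entries."""
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    x = [0.0] * n
    x[n - 1] = d[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

hbar, m = 1.0, 1.0
T, N = 2.0, 20
eps = T / N

def G_continuum(t, s):
    """Extra term of Eqs. (7.64): (hbar/(i m)) t (s - T)/T for t < s, with t and s swapped otherwise."""
    lo, hi = min(t, s), max(t, s)
    return (hbar / (1j * m)) * lo * (hi - T) / T

max_diff = 0.0
for k in range(1, N):
    unit = [1.0 if j == k else 0.0 for j in range(1, N)]
    column = solve_tridiagonal(N - 1, 2.0, -1.0, unit)    # k-th column of the inverse of M
    for j in range(1, N):
        G_discrete = 1j * hbar * eps / m * column[j - 1]  # assumed gaussian rule
        max_diff = max(max_diff, abs(G_discrete - G_continuum(j * eps, k * eps)))
print(max_diff)
```

Under these assumptions the sliced and continuum expressions coincide at every grid point, and both are pure imaginary.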

The transition element of the product of two positions taken at two 
different times contains more than just the product of the two corre- 
sponding positions along the classical path. There is a small additional 
term which is purely quantum-mechanical in nature. This additional 
term is consistent with our picture of quantum-mechanical motion. Even 
though the particle moving between fixed end points will be found on the 
average along the classical path, it has an amplitude for motion along 
all alternative paths. This fact must be remembered when considering 
the transition element of the product of positions at two different times. 
All the possible positions among all the various alternatives must be ac- 
counted for in the transition element, and this accounting introduces the 
extra term. Only at the specified end points are no other alternatives 
possible. 

We can better understand the significance of this result if we make 
use once again of the terminology from our classical analogue. Suppose 
the path of the particle goes through a particularly large value of $x$ at
some time $s$. Then the “average” value of $x$ at a later time $t$ is not just
the ordinary average $\bar{x}(t)$. There is a correlation with the previous large
deflection. Therefore, the “average” product is not just the product of
“averages.”

In this and other applications of the classical analogue, we remember
that the “average” referred to is defined with the help of the weighting
function $e^{iS/\hbar}$. This weighting function is not positive definite, and
is in fact complex. Thus we develop such purely quantum-mechanical
results as that of Eqs. (7.64), wherein the extra correlation term is pure
imaginary!


Problem 7-8 Find the transition element $\langle x(t)x(s)\rangle = g(t,s)$ when
the potential is not constant but, rather, corresponds to that of a forced
harmonic oscillator. Do this by obtaining differential equations for $g(t,s)$
and trying the solution

$$\langle x(t)x(s)\rangle = g(t,s) = [\bar{x}(t)\bar{x}(s) + G(t,s)]\langle 1\rangle \tag{7.65}$$

Obtain an equation for $G(t,s)$ showing that $G$ is independent of the
end-point values $x_a$ and $x_b$ and of the forcing function $f(t)$. Show in
general that, with $T = t_b - t_a$,

$$G(t,s) = \begin{cases}
-\dfrac{\hbar}{im\omega}\,\dfrac{\sin\omega t\,\sin\omega(T - s)}{\sin\omega T} & \text{for } t < s \\[2ex]
-\dfrac{\hbar}{im\omega}\,\dfrac{\sin\omega s\,\sin\omega(T - t)}{\sin\omega T} & \text{for } t > s
\end{cases} \tag{7.66}$$


7-4 GENERAL RESULTS FOR QUADRATIC ACTIONS


Evidently if the action S is a quadratic form, transition elements of many 
functionals can be determined readily. This suggests that we extend 
our consideration into a somewhat more general class of functionals. 
The technique to be used is the same as that described in Sec. 3-5. 
For example, we note that with a quadratic action S we can easily 
evaluate the transition element of $\exp\{(i/\hbar)\int f(t)x(t)\,dt\}$, where $f(t)$
is any arbitrary function of time. The transition element of such a
functional can be written as

$$\left\langle\exp\left\{\frac{i}{\hbar}\int f(t)x(t)\,dt\right\}\right\rangle_S = \iiint \chi^*(x_b)\,\exp\left\{\frac{i}{\hbar}\left[S + \int f(t)x(t)\,dt\right]\right\}\psi(x_a)\,\mathcal{D}x(t)\,dx_a\,dx_b \tag{7.67}$$

If the original action $S$ is gaussian, then so is the action

$$S' = S + \int_{t_a}^{t_b} f(t)x(t)\,dt$$

Thus the path integrals on the right of Eq. (7.67) can be carried out by
the methods of Sec. 3-5. If $S'_{cl}$ is the extremum of the action $S'$, then
the factor $\exp\{iS'_{cl}/\hbar\}$ can be extracted as a factor for the path integral
of Eq. (7.67). The remaining factor is a path integral over the paths
$y(t)$, which run from zero to zero during the allowed time interval. (We
set $x(t) = \bar{x}(t) + y(t)$, where $\bar{x}(t)$ is the classical path corresponding to
the extremum of the action.)

The integral over the paths $y(t)$ does not depend upon the function
$f(t)$, since this function appears in the action $S'$ multiplying only a
linear term in $x(t)$, and we have seen (Eq. 3.49) that the remaining path
integral involves only the quadratic parts of $S'$, which are the same as
the quadratic parts of $S$. This means that the path integral on the
right-hand side of Eq. (7.67) can be reduced to an exponential function
multiplied by the transition element $\langle 1\rangle$. The result is

$$\left\langle\exp\left\{\frac{i}{\hbar}\int f(t)x(t)\,dt\right\}\right\rangle = \langle 1\rangle\exp\left\{\frac{i}{\hbar}(S'_{cl} - S_{cl})\right\} \tag{7.68}$$

Once the extremum $S'_{cl}$ has been evaluated, the extremum $S_{cl}$ can be
obtained from it by setting $f(t)$ identically equal to zero. The action of
the forced harmonic oscillator, described by Eq. (3.66), is a special case
of this action $S'_{cl}$.


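Problem 7-9 Use this result to show that if $S$ corresponds to a
harmonic oscillator

$$S = \frac{m}{2}\int_{t_a}^{t_b}(\dot{x}^2 - \omega^2x^2)\,dt$$

then

$$\begin{aligned}
\left\langle\exp\left\{\frac{i}{\hbar}\int f(t)x(t)\,dt\right\}\right\rangle = \langle 1\rangle\exp\Bigg\{\frac{i}{\hbar}\,\frac{m\omega}{2\sin\omega T}\Bigg[&\frac{2x_b}{m\omega}\int_{t_a}^{t_b} f(t)\sin\omega(t - t_a)\,dt + \frac{2x_a}{m\omega}\int_{t_a}^{t_b} f(t)\sin\omega(t_b - t)\,dt \\
&- \frac{2}{m^2\omega^2}\int_{t_a}^{t_b}\!\!\int_{t_a}^{t} f(t)f(s)\sin\omega(t_b - t)\sin\omega(s - t_a)\,ds\,dt\Bigg]\Bigg\}
\end{aligned}$$

where $x_a$ and $x_b$ are the initial and final coordinates of the oscillator.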


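From the transition element given by Eq. (7.68) we can obtain the
transition element of $x(t)$ itself by another method. Suppose we
differentiate Eq. (7.68) with respect to $f(t)$. The result is

$$\begin{aligned}
\left\langle x(t)\exp\left\{\frac{i}{\hbar}\int f(t)x(t)\,dt\right\}\right\rangle &= \frac{\hbar}{i}\frac{\delta}{\delta f(t)}\left[\langle 1\rangle\exp\left\{\frac{i}{\hbar}(S'_{cl} - S_{cl})\right\}\right] \\
&= \langle 1\rangle\,\frac{\delta S'_{cl}}{\delta f(t)}\exp\left\{\frac{i}{\hbar}(S'_{cl} - S_{cl})\right\}
\end{aligned} \tag{7.69}$$

Therefore, by evaluating both sides when $f(t) = 0$, we obtain

$$\langle x(t)\rangle = \langle 1\rangle\left[\frac{\delta S'_{cl}}{\delta f(t)}\right]_{f=0} \tag{7.70}$$

We can continue this process to get the second derivative as

$$\begin{aligned}
\langle x(t)x(s)\rangle &= \langle 1\rangle\left(\frac{\hbar}{i}\right)^2\frac{\delta^2}{\delta f(t)\,\delta f(s)}\exp\left\{\frac{i}{\hbar}(S'_{cl} - S_{cl})\right\}\Bigg|_{f=0} \\
&= \langle 1\rangle\left[\frac{\hbar}{i}\frac{\delta^2 S'_{cl}}{\delta f(t)\,\delta f(s)} + \frac{\delta S'_{cl}}{\delta f(t)}\frac{\delta S'_{cl}}{\delta f(s)}\right]_{f=0}
\end{aligned} \tag{7.71}$$

Actually, since $S'_{cl}$ is only quadratic in $f$ (see Eq. 3.66), the transition
element of a factor of any number of $x$’s can be directly evaluated in terms
of $\delta S'_{cl}/\delta f(t)$ and the quantity $\delta^2 S'_{cl}/\delta f(t)\,\delta f(s)$, which is independent
of $f$. This explains the form of Eqs. (7.64) and (7.65) and permits the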
transition element of a factor of three x’s to be written down. 
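
Equations (7.70) and (7.71) can be tried numerically on the oscillator of Prob. 7-9. The Python sketch below is an illustration under stated assumptions: the expression for S'_cl - S_cl is the one quoted in Prob. 7-9, the functional derivative with respect to f is approximated by (1/ε)∂/∂f_k in the manner of Eq. (7.22), and all numerical values and helper names (`phi`, `bump`, `x_classical`) are hypothetical. It checks that the first derivative at f = 0 reproduces the classical oscillator path, and that ħ/i times the mixed second derivative reproduces the t > s branch of Eq. (7.66).

```python
import math

m, omega, hbar = 1.0, 1.3, 1.0
ta, tb = 0.0, 2.0
T = tb - ta
xa, xb = 0.4, -0.7
N = 40
eps = T / N
ts = [ta + (k + 0.5) * eps for k in range(N)]     # midpoints of the time slices

def phi(f):
    """S'_cl - S_cl of Prob. 7-9 for samples f[k] = f(ts[k]); integrals as Riemann sums."""
    lin = 0.0
    for fk, t in zip(f, ts):
        lin += fk * (2.0 * xb / (m * omega) * math.sin(omega * (t - ta))
                     + 2.0 * xa / (m * omega) * math.sin(omega * (tb - t))) * eps
    quad = 0.0
    for i in range(N):
        for j in range(i):                        # s = ts[j] is earlier than t = ts[i]
            quad += (f[i] * f[j] * math.sin(omega * (tb - ts[i]))
                     * math.sin(omega * (ts[j] - ta)) * eps * eps)
    pref = m * omega / (2.0 * math.sin(omega * T))
    return pref * (lin - 2.0 / (m * omega) ** 2 * quad)

def bump(*indices):
    """Forcing function equal to 1 on the listed slices and 0 elsewhere."""
    f = [0.0] * N
    for i in indices:
        f[i] = 1.0
    return f

def x_classical(t):
    """Classical path of the unforced oscillator between (xa, ta) and (xb, tb)."""
    return (xb * math.sin(omega * (t - ta))
            + xa * math.sin(omega * (tb - t))) / math.sin(omega * T)

i, j = 30, 10                                     # ts[i] > ts[j]
first = phi(bump(i)) / eps                        # discrete version of Eq. (7.70)
mixed = (phi(bump(i, j)) - phi(bump(i)) - phi(bump(j))) / eps ** 2
G_from_action = (hbar / 1j) * mixed               # first term of Eq. (7.71)
G_eq_766 = (-(hbar / (1j * m * omega)) * math.sin(omega * (ts[j] - ta))
            * math.sin(omega * (tb - ts[i])) / math.sin(omega * T))
print(first, x_classical(ts[i]), G_from_action, G_eq_766)
```

Because the sampled expression is exactly linear plus bilinear in the f_k, unit-amplitude differences give the derivatives at f = 0 without truncation error.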


Problem 7-10 Show, for any quadratic functional, if we write
$\langle x(t)\rangle = \bar{x}(t)\langle 1\rangle$ and $\langle x(t)x(s)\rangle = [\bar{x}(t)\bar{x}(s) + G(t,s)]\langle 1\rangle$, that

$$\langle x(t)x(s)x(u)\rangle = [\bar{x}(t)\bar{x}(s)\bar{x}(u) + \bar{x}(t)G(s,u) + \bar{x}(s)G(t,u) + \bar{x}(u)G(t,s)]\langle 1\rangle$$

Find the transition element of the product of four $x$’s. (Suggestion:
Since $S'_{cl} - S_{cl}$ is quadratic in $f$ and zero for $f = 0$, it must have the
mathematical form $S'_{cl} - S_{cl} = \frac{1}{2}\int\!\!\int f(t)f(s)G(t,s)\,dt\,ds + \int\bar{x}(t)f(t)\,dt$,
where $G$ and $\bar{x}$ are some functions.)


7-5 TRANSITION ELEMENTS AND THE OPERATOR NOTATION


In this and the following sections we shall see how transition elements 
look in the conventional notation of wave functions and operators. This 
will help the reader who is familiar with that form of expression to relate 
the results of path integral calculations to other results that he already 
knows. 

If F is a functional only of x at a single time, say, the function 
$V(x_k)$ at time $t_k$, we know from Eq. (7.10) how to evaluate its transition
element. Similarly, if F depends on the value of x(t) at two different 
times, Eq. (7.15) tells us what to do. 

Let us consider next the case that F represents the momentum at 
time t and make use of the approximation that the time axis is cut up 
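into slices of length $\epsilon$. Thus

$$F = m\,\frac{x_{k+1} - x_k}{\epsilon} \tag{7.72}$$

Then we have

$$\left\langle\chi\left|m\,\frac{x_{k+1} - x_k}{\epsilon}\right|\psi\right\rangle_S = \frac{m}{\epsilon}\left(\langle\chi|x_{k+1}|\psi\rangle_S - \langle\chi|x_k|\psi\rangle_S\right) \tag{7.73}$$

The right-hand side of Eq. (7.73) can be written as

$$\frac{m}{\epsilon}\left[\int\chi^*(x, t + \epsilon)\,x\,\psi(x, t + \epsilon)\,dx - \int\chi^*(x,t)\,x\,\psi(x,t)\,dx\right] \tag{7.74}$$

Now making use of the wave equation,

$$\psi(x, t + \epsilon) = \psi(x,t) + \epsilon\frac{\partial\psi}{\partial t} = \psi - \frac{i\epsilon}{\hbar}H\psi \tag{7.75}$$

$$\chi^*(x, t + \epsilon) = \chi^*(x,t) + \epsilon\frac{\partial\chi^*}{\partial t} = \chi^* + \frac{i\epsilon}{\hbar}(H\chi)^* \tag{7.76}$$

from Prob. 4-3, where $H$ is the hamiltonian belonging to the action $S$.
Therefore, to first order in $\epsilon$

$$\int\chi^*(x, t + \epsilon)\,x\,\psi(x, t + \epsilon)\,dx - \int\chi^*(x,t)\,x\,\psi(x,t)\,dx = -\frac{i\epsilon}{\hbar}\int\chi^*(x,t)\,x\,[H\psi(x,t)]\,dx + \frac{i\epsilon}{\hbar}\int[H\chi(x,t)]^*\,x\,\psi(x,t)\,dx \tag{7.77}$$

By Eq. (4.30) this last integral can be written as $\int\chi^*(x,t)[Hx\psi(x,t)]\,dx$,
or more simply we have

$$\langle\chi|m\dot{x}|\psi\rangle = -\frac{im}{\hbar}\int\chi^*(xH - Hx)\psi\,dx \tag{7.78}$$

using the operator notation. This is the same as

$$\langle\chi|m\dot{x}|\psi\rangle = -\frac{im}{\hbar}\int\chi^*\,\frac{\hbar^2}{m}\frac{\partial\psi}{\partial x}\,dx = \int\chi^*\,\frac{\hbar}{i}\frac{\partial\psi}{\partial x}\,dx \tag{7.79}$$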


where we have used the result of Prob. 4-4. The operator
$(\hbar/i)\,\partial/\partial x$ is called the momentum operator or, more
specifically, the operator representing momentum in the x direction. We
already see why. Constructing the transition element of $m\dot{x}$ is
equivalent to putting the operator $(\hbar/i)\,\partial/\partial x$ between
$\chi^*$ and $\psi$, just as constructing the transition element of x is
equivalent to putting x between $\chi^*$ and $\psi$. These relations can be
understood, perhaps with greater clarity, if we go over to the momentum
representation. If


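$$\chi(p) = \int\chi(x)\,e^{-(i/\hbar)px}\,dx \qquad \psi(p) = \int\psi(x)\,e^{-(i/\hbar)px}\,dx \tag{7.80}$$

are the momentum representations of $\chi$ and $\psi$, one can show that

$$\int\chi^*(x)\,\frac{\hbar}{i}\frac{\partial\psi(x)}{\partial x}\,dx = \int\chi^*(p)\,p\,\psi(p)\,\frac{dp}{2\pi\hbar} \tag{7.81}$$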


Problem 7-11 Show this. 
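A numerical spot check of Eq. (7.81) can make the statement concrete before it is proved. The sketch below is only an illustration: the two Gaussian test states, the grids, and the choice of units with ħ = 1 are assumptions made here, not part of the text. It evaluates the position-space integral on the left of Eq. (7.81) and the momentum-space integral on the right, using the transform convention of Eq. (7.80), as simple Riemann sums.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1


def chi(x):
    # Hypothetical test state: Gaussian centred at x = 0 with mean momentum 1.
    return cmath.exp(-0.5 * x * x + 1j * x)


def psi(x):
    # Hypothetical test state: Gaussian centred at x = 1 with mean momentum 2.
    return cmath.exp(-0.5 * (x - 1.0) ** 2 + 2j * x)


def dpsi_dx(x):
    # Analytic derivative of the psi defined above.
    return (-(x - 1.0) + 2j) * psi(x)


def grid(lo, hi, step):
    n = int(round((hi - lo) / step))
    return [lo + k * step for k in range(n + 1)]


def momentum_rep(values, xs, dx, p):
    # Eq. (7.80) as a Riemann sum: f(p) = sum_x f(x) exp(-i p x / hbar) dx.
    return sum(f * cmath.exp(-1j * p * x / HBAR) for f, x in zip(values, xs)) * dx


def position_side(xs, dx):
    # Left-hand side of Eq. (7.81).
    return sum(chi(x).conjugate() * (HBAR / 1j) * dpsi_dx(x) for x in xs) * dx


def momentum_side(xs, dx, ps, dp):
    # Right-hand side of Eq. (7.81), including the dp / (2 pi hbar) measure.
    chi_x = [chi(x) for x in xs]
    psi_x = [psi(x) for x in xs]
    total = 0j
    for p in ps:
        chi_p = momentum_rep(chi_x, xs, dx, p)
        psi_p = momentum_rep(psi_x, xs, dx, p)
        total += chi_p.conjugate() * p * psi_p
    return total * dp / (2.0 * math.pi * HBAR)


XS = grid(-9.0, 10.0, 0.05)
PS = grid(-7.0, 10.0, 0.05)
LHS = position_side(XS, 0.05)
RHS = momentum_side(XS, 0.05, PS, 0.05)
print(abs(LHS - RHS))
```

The two sides should agree to within the quadrature error, which is very small for Gaussians on these grids. Agreement for one pair of states is of course not a proof; it only checks the factor $1/2\pi\hbar$ and the sign convention in the exponent.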


Another way to see this relation is the following. Consider the tran- 
sition amplitude given by 


$$\langle\chi|1|\psi\rangle = \iint\chi^*(x_b,t_b)\,K(x_b,t_b;x_a,t_a)\,\psi(x_a,t_a)\,dx_a\,dx_b \tag{7.82}$$

Now suppose the $x_a$ origin is shifted left by a small amount $\Delta$.
Calling the new variable $x'_a$, we have

$$x_a = x'_a - \Delta \tag{7.83}$$

Using this new variable rather than the old $x_a$ will not alter the
transition amplitude of Eq. (7.82). It becomes


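$$\langle\chi|1|\psi\rangle = \iiint\chi^*(x_b,t_b)\,\exp\left\{\frac{i}{\hbar}\sum_{i=2}^{N-1}S[x_{i+1},t_{i+1};x_i,t_i] + \frac{i}{\hbar}S[x_2,t_2;x'_a-\Delta,t_a]\right\}\psi(x'_a-\Delta,t_a)\,\mathcal{D}x(t)\,dx'_a\,dx_b \tag{7.84}$$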


where the path integral for the kernel has been written out explicitly, 
using the methods of Eq. (2.22). 

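Next, we expand $S[x_2,t_2;x'_a-\Delta,t_a]$ and $\psi(x'_a-\Delta,t_a)$ in
Taylor series and keep only the first-order terms. The exponential
function becomes

$$\exp\left\{\frac{i}{\hbar}\sum_{i=2}^{N-1}S[x_{i+1},t_{i+1};x_i,t_i] + \frac{i}{\hbar}S[x_2,t_2;x'_a,t_a]\right\}\left(1 - \frac{i}{\hbar}\,\Delta\,\frac{\partial S[x_2,t_2;x'_a,t_a]}{\partial x'_a}\right) \tag{7.85}$$

We may drop the prime notation in the integral defining the transition
amplitude, since $x_a$ is a variable of integration. The form of Eq. (7.84)
now becomes

$$\langle\chi|1|\psi\rangle = \iint\chi^*(x_b,t_b)\,K(b,a)\,\psi(x_a,t_a)\,dx_a\,dx_b - \frac{i}{\hbar}\,\Delta\iint\chi^*(x_b,t_b)\,K(b,a)\left[\frac{\partial S[x_2,t_2;x_a,t_a]}{\partial x_a}\,\psi(x_a,t_a) + \frac{\hbar}{i}\frac{\partial\psi(x_a,t_a)}{\partial x_a}\right]dx_a\,dx_b \tag{7.86}$$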


where we retain the notation that point $x_2$ is spaced along the path
$x(t)$ only by the short time interval $\epsilon$ from the point
$x_a = x_1$ and $t_2 = t_a + \epsilon$.

The first term on the right of Eq. (7.86) is identical to the transition 
amplitude on the left. This means that the remaining term must be zero. 
But this remaining term is a combination of two transition elements. 


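Thus

$$\left\langle\chi\left|\frac{\partial S[x_2,t_a+\epsilon;x_a,t_a]}{\partial x_a}\right|\psi\right\rangle = -\frac{\hbar}{i}\left\langle\chi\left|1\right|\frac{\partial\psi(x_a,t_a)}{\partial x_a}\right\rangle \tag{7.87}$$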


In the convention of Eq. (2.22) we use the classical action along each
of the short segments of the path. Thus the action $S[2,a]$ appearing in
Eq. (7.87) is the classical action for the initial path element. Its
negative derivative with respect to $x_a$ is the classical definition of
the momentum at $x_a$ (see Eq. 2.11). So we can write

$$\langle\chi|p_a|\psi\rangle = \left\langle\chi\left|1\right|\frac{\hbar}{i}\frac{\partial\psi}{\partial x_a}\right\rangle \tag{7.88}$$


which is the same result as that obtained in Eqs. (7.78) and (7.79). 
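The step from Eq. (7.78) to Eq. (7.79) rests on the commutator of H with x. A minimal numerical sketch of that step follows, assuming units with ħ = m = 1, a hypothetical test function, and an arbitrary potential (all chosen here for illustration). It applies $(im/\hbar)(Hx - xH)$ to the test function with a finite-difference second derivative and compares the result with $(\hbar/i)\,\partial\psi/\partial x$.

```python
import cmath

HBAR = 1.0  # assumption: natural units
MASS = 1.0


def psi(x):
    # Hypothetical smooth test function (Gaussian times a plane wave).
    return cmath.exp(-0.5 * x * x + 1.5j * x)


def dpsi_dx(x):
    # Analytic derivative of the psi defined above.
    return (-x + 1.5j) * psi(x)


def potential(x):
    # Arbitrary V(x); it cancels in the commutator H x - x H.
    return 0.5 * x * x + 0.1 * x ** 4


def apply_h(f, x, h=1e-3):
    # H f = -(hbar^2 / 2m) f'' + V f, with a central second difference for f''.
    second = (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)
    return -(HBAR ** 2) / (2.0 * MASS) * second + potential(x) * f(x)


def momentum_via_commutator(x):
    # (i m / hbar) (H x - x H) psi, the operator inside Eq. (7.78).
    def x_psi(y):
        return y * psi(y)
    return (1j * MASS / HBAR) * (apply_h(x_psi, x) - x * apply_h(psi, x))


def momentum_via_derivative(x):
    # (hbar / i) d psi / dx, the operator inside Eq. (7.79).
    return (HBAR / 1j) * dpsi_dx(x)


for x0 in (-0.7, 0.3, 1.2):
    print(x0, abs(momentum_via_commutator(x0) - momentum_via_derivative(x0)))
```

The printed differences are of the order of the finite-difference error (second order in the step h), which is the expected behavior if the two operator forms are equivalent.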

Sometimes working with a complicated S that results perhaps from
the partial elimination of interacting parts, we would like to identify
the functional p(t) which corresponds to the momentum at time t. The
work of the preceding paragraph suggests a general definition. The
first-order change in the transition amplitude $\langle\chi|1|\psi\rangle$,
if all coordinates corresponding to times previous to t are shifted by
$-\Delta$, is $(i/\hbar)\Delta$ times $\langle\chi|p(t)|\psi\rangle$. From
this principle the momentum functional may be found for an arbitrarily
complicated S. In a like manner, the hamiltonian or energy functional can
be defined by shifting the time variables, as we shall describe in
Sec. 7-7.













Problem 7-12 Show, if g is any function of position only, that

$$\left\langle\chi\left|\frac{dg}{dt}\right|\psi\right\rangle = \left\langle\chi\left|\frac{g(x_{k+1})-g(x_k)}{\epsilon}\right|\psi\right\rangle = -\frac{i}{\hbar}\int\chi^*(gH-Hg)\,\psi\,dx \tag{7.89}$$

Consider the case that g is a function of the time as well. Show that the
transition element of dg/dt is equivalent to the transition element of the
operator $(i/\hbar)(Hg-gH) + \partial g/\partial t$.


Problem 7-13 Show that

$$\langle\chi|m\ddot{x}|\psi\rangle = -\frac{i}{\hbar}\int\chi^*(pH-Hp)\,\psi\,dx \tag{7.90}$$

and argue for any quantity A, given in terms of an operator or otherwise,
that dA/dt is equivalent to $-(i/\hbar)(AH-HA) + \partial A/\partial t$.











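Next we consider an expression F involving two quantities evaluated
in rapid succession, such as

$$F = m\,\frac{x_{k+1}-x_k}{\epsilon}\,x_k \tag{7.91}$$

This evidently gives

$$\langle\chi|F|\psi\rangle = \frac{m}{\epsilon}\iint\chi^*(x,t+\epsilon)\,x\,K(x,t+\epsilon;y,t)\,y\,\psi(y,t)\,dy\,dx - \frac{m}{\epsilon}\int\chi^*(x,t)\,x^2\,\psi(x,t)\,dx \tag{7.92}$$

where $t = t_k$. In developing Eq. (4.12) from Eq. (4.2) we saw

$$\int_{-\infty}^{\infty}K(x,t+\epsilon;y,t)\,f(y)\,dy = \left(1-\frac{i\epsilon}{\hbar}H\right)f(x) \tag{7.93}$$

so that the first integral in Eq. (7.92) is

$$\frac{m}{\epsilon}\int\chi^*(x,t+\epsilon)\,x\left(1-\frac{i\epsilon}{\hbar}H\right)x\,\psi(x,t)\,dx \tag{7.94}$$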


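Expressing $\chi^*$ by Eq. (7.76) and using the hermitian property of H, we
find that this integral is

$$\frac{m}{\epsilon}\int\chi^*(x,t)\left(1+\frac{i\epsilon}{\hbar}H\right)x\left(1-\frac{i\epsilon}{\hbar}H\right)x\,\psi(x,t)\,dx = \frac{m}{\epsilon}\int\chi^*\,x^2\,\psi\,dx + \frac{im}{\hbar}\int\chi^*(Hx-xH)\,x\,\psi\,dx \tag{7.95}$$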


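Thus finally

$$\left\langle\chi\left|m\,\frac{x_{k+1}-x_k}{\epsilon}\,x_k\right|\psi\right\rangle = \frac{im}{\hbar}\int\chi^*(x,t)\,(Hx-xH)\,x\,\psi(x,t)\,dx = \int\chi^*(x,t)\,p\,x\,\psi(x,t)\,dx \tag{7.96}$$

the last step following from Eq. (7.78).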


This is an example of the general rule: In writing the integral
definition of the transition element for a set of quantities corresponding
to a succession of times, the corresponding operators are written in order
from right to left, according to the order in time of the original
transition element. If there is a finite time interval $\Delta t$ between
them, a K, or alternatively the operator $e^{-(i/\hbar)H\,\Delta t}$, must be
inserted. (For an example, see Prob. 7-16.) As the time interval $\epsilon$
between two successive quantities approaches zero the K approaches a Dirac
delta function and the rule results.





Problem 7-14 Show that the transition amplitude of
$(m/\epsilon)(x_{k+1}-x_k)\,f(x_{k+1})$ is equivalent to that of $(f\cdot p)$.


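Problem 7-15 Show that the rule works for two successive
momenta, that is,

$$\left\langle\chi\left|m\,\frac{x_{k+1}-x_k}{\epsilon}\;m\,\frac{x_k-x_{k-1}}{\epsilon}\right|\psi\right\rangle = \int\chi^*(x,t)\,p\,p\,\psi(x,t)\,dx = -\hbar^2\int\chi^*(x,t)\,\frac{\partial^2\psi(x,t)}{\partial x^2}\,dx \tag{7.97}$$


Problem 7-16 Show that

$$\left\langle\chi\left|x_l\;m\,\frac{x_{k+1}-x_k}{\epsilon}\right|\psi\right\rangle = \iint\chi^*(x,t)\,x\,K(x,t;y,s)\,\frac{\hbar}{i}\frac{\partial\psi(y,s)}{\partial y}\,dy\,dx \tag{7.98}$$

if $t_l = t$ and $t_k = s$, provided $t_l > t_k$. What happens if $t_l < t_k$?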








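Notice that the square of the momentum $p^2$ corresponds to pp, or
two successive velocities times mass multiplied together (as in Prob.
7-15). It does not correspond to the simple square of velocity at one time,
$\langle\chi|m^2(x_{k+1}-x_k)^2/\epsilon^2|\psi\rangle$; for that goes to
infinity as $-m\hbar/i\epsilon$ when $\epsilon\to 0$, as we have seen in
Sec. 7-3, particularly in Eq. (7.49). The difference between this
expression and the divergent term $-m\hbar/i\epsilon$ is the left-hand side
of Eq. (7.97), which is in fact $p^2$ in the limit. That is,

$$\left\langle\chi\left|m^2\left(\frac{x_{k+1}-x_k}{\epsilon}\right)^2\right|\psi\right\rangle = -\frac{m\hbar}{i\epsilon}\,\langle\chi|1|\psi\rangle + \left\langle\chi\left|m\,\frac{x_{k+1}-x_k}{\epsilon}\;m\,\frac{x_k-x_{k-1}}{\epsilon}\right|\psi\right\rangle \tag{7.99}$$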


Problem 7-17 Prove this, using Eq. (7.40) with

$$F = m\,\frac{x_{k+1}-x_k}{\epsilon}$$


THE PERTURBATION SERIES FOR A VECTOR POTENTIAL 


The singular behavior of the transition element of the square of the 
velocity, as shown in Eq. (7.49), has as a consequence the fact that 
many expressions involving velocity must be translated with care. For 
example, the lagrangian for a particle of charge e in an electromagnetic 
field is 


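$$L(\dot{\mathbf{x}},\mathbf{x},t) = \frac{m}{2}|\dot{\mathbf{x}}|^2 - e\phi(\mathbf{x},t) + \frac{e}{c}\,\dot{\mathbf{x}}\cdot\mathbf{A}(\mathbf{x},t) \tag{7.100}$$

Let us take $\phi = 0$ and ask for the effect of the vector potential A
considered as a perturbation. That is, with
$S_0 = (m/2)\int|\dot{\mathbf{x}}|^2\,dt$ and
$\sigma = (e/c)\int\dot{\mathbf{x}}\cdot\mathbf{A}(\mathbf{x},t)\,dt$, we
develop a series for use in a perturbation treatment and solve for the
resulting transition elements. Thus

$$\left\langle e^{(i/\hbar)\sigma}\right\rangle_{S_0} = \langle 1\rangle_{S_0} + \frac{i}{\hbar}\langle\sigma\rangle_{S_0} - \frac{1}{2\hbar^2}\langle\sigma^2\rangle_{S_0} + \cdots \tag{7.101}$$

The first-order term is $ie/\hbar c$ times the expression

$$\left\langle\chi\left|\int\dot{\mathbf{x}}\cdot\mathbf{A}(\mathbf{x},t)\,dt\right|\psi\right\rangle \tag{7.102}$$

We wish to translate this to operator notation. In defining $\sigma$ for a
discontinuous path (a series of steps of time length $\epsilon$) we might at
first expect to write either

$$\sigma = \frac{e}{c}\sum_k(\mathbf{x}_{k+1}-\mathbf{x}_k)\cdot\mathbf{A}(\mathbf{x}_k,t_k) \tag{7.103}$$

or

$$\sigma = \frac{e}{c}\sum_k(\mathbf{x}_{k+1}-\mathbf{x}_k)\cdot\mathbf{A}(\mathbf{x}_{k+1},t_{k+1}) \tag{7.104}$$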


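Either one, in the limit of a continuous path, gives the integral for
$\sigma$. But if we look at a particular component of A, say, $A_x$, we find
that $A_x(\mathbf{x}_{k+1},t_{k+1})$ differs from $A_x(\mathbf{x}_k,t_k)$ by
approximately

$$(\mathbf{x}_{k+1}-\mathbf{x}_k)\cdot\nabla A_x + \epsilon\,\frac{\partial A_x}{\partial t} \tag{7.105}$$

which, when multiplied by $\mathbf{x}_{k+1}-\mathbf{x}_k$ again, might be
expected to be of second order in $\epsilon$ for each k, thus leading to a
term of first order in $\epsilon$ when the sum over k is performed. But our
paths are not continuous and the transition element of the mean square of
$x_{k+1}-x_k$ is of first order. In fact (see Prob. 7-6)

$$(x_{k+1}-x_k)^2 \sim -\frac{\hbar\epsilon}{im} \qquad (x_{k+1}-x_k)(y_{k+1}-y_k) \sim 0 \qquad (y_{k+1}-y_k)^2 \sim -\frac{\hbar\epsilon}{im}$$

etc., to first order in $\epsilon$. Hence Eq. (7.103) differs from
Eq. (7.104) by approximately

$$\frac{\hbar e}{imc}\sum_k\epsilon\,\nabla\cdot\mathbf{A}(\mathbf{x}_k,t_k) = \frac{\hbar e}{imc}\int\nabla\cdot\mathbf{A}\,dt \tag{7.106}$$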


a zero-order term. So it is imperative to decide which form is correct.
The general answer to such a question was given in Chap. 2. There
the rule given was that S is replaced by
$\sum_k S_{cl}[x_{k+1},t_{k+1};x_k,t_k]$, where $S_{cl}$ is the classical
action to go from one point to a neighboring point. It is not necessary to
calculate this action exactly, but only sufficiently closely to resolve
ambiguities. Equations (7.103) and (7.104) are not sufficiently closely
calculated for this purpose, but the classical action for a short interval
is very close to

$$S_{cl}[k+1,k] = \frac{m}{2}\frac{|\mathbf{x}_{k+1}-\mathbf{x}_k|^2}{\epsilon} + \frac{e}{c}\,(\mathbf{x}_{k+1}-\mathbf{x}_k)\cdot\frac{\mathbf{A}(k+1)+\mathbf{A}(k)}{2} \tag{7.107}$$


Therefore, the correct expression for $\sigma$ is the average of
Eqs. (7.103) and (7.104), so that the transition element of Eq. (7.102) is

$$\left\langle\chi\left|\sum_k\tfrac{1}{2}(\mathbf{x}_{k+1}-\mathbf{x}_k)\cdot\left[\mathbf{A}(\mathbf{x}_{k+1},t_{k+1})+\mathbf{A}(\mathbf{x}_k,t_k)\right]\right|\psi\right\rangle \tag{7.108}$$

Leaving the sum over k for later evaluation (as an integral over time)
the result is the operator
$(1/2m)(\mathbf{p}\cdot\mathbf{A}+\mathbf{A}\cdot\mathbf{p})$ (see Prob. 7-12).

That is, in an electromagnetic potential, the first-order term in the
perturbation expansion Eq. (7.101) has the same form as the first-order
term given in Eq. (6.11), but with the quantity V replaced by the
operator $-(e/2mc)(\mathbf{p}\cdot\mathbf{A}+\mathbf{A}\cdot\mathbf{p})$.
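The claim that the two endpoint rules, Eqs. (7.103) and (7.104), differ by a term of zero order can be illustrated with a real-valued analogue. The sketch below is not the quantum calculation: it replaces the quantum paths, for which the mean square step is imaginary, by a Brownian path whose steps have real variance ε, and it uses a hypothetical one-dimensional field A(x) = x. Both choices are assumptions made here so that the sums can be checked by hand. With them, the difference of the two endpoint sums is the sum of squared steps, which stays close to the total time T however fine the slicing, while the averaged rule of Eq. (7.107) telescopes exactly.

```python
import math
import random


def a_field(x):
    # Hypothetical one-dimensional "vector potential" A(x) = x (an assumption for this sketch).
    return x


def endpoint_sums(n_steps, total_time, seed=0):
    # Build one Brownian path with <dx^2> = eps and accumulate the three discretizations.
    rng = random.Random(seed)
    eps = total_time / n_steps
    x = 0.0
    pre = post = mid = 0.0
    for _ in range(n_steps):
        dx = rng.gauss(0.0, math.sqrt(eps))
        x_next = x + dx
        pre += dx * a_field(x)                              # analogue of Eq. (7.103)
        post += dx * a_field(x_next)                        # analogue of Eq. (7.104)
        mid += dx * 0.5 * (a_field(x) + a_field(x_next))    # averaged rule, Eq. (7.107)
        x = x_next
    return pre, post, mid, x


PRE, POST, MID, X_END = endpoint_sums(20000, 1.0)
print(POST - PRE, MID, 0.5 * X_END ** 2)
```

Refining the slicing (larger `n_steps`) does not shrink `POST - PRE`; it remains near T = 1. This is the real-time shadow of the statement in the text that the discrepancy is of zero order, so the choice of rule matters.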

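This conclusion is not true for the second-order term in Eq. (7.101).
The second-order term requires our finding

$$-\frac{1}{2}\left(\frac{e}{\hbar c}\right)^2\left\langle\chi\left|\sum_j\sum_k\left[(\mathbf{x}_{j+1}-\mathbf{x}_j)\cdot\tfrac{1}{2}\left(\mathbf{A}(j+1)+\mathbf{A}(j)\right)\right]\left[(\mathbf{x}_{k+1}-\mathbf{x}_k)\cdot\tfrac{1}{2}\left(\mathbf{A}(k+1)+\mathbf{A}(k)\right)\right]\right|\psi\right\rangle \tag{7.109}$$

Nothing special happens for the terms with $j \ne k$, and we obtain in fact
precisely the second-order term expected by comparison to Eq. (6.13)
with V replaced by the operator
$-(e/2mc)(\mathbf{p}\cdot\mathbf{A}+\mathbf{A}\cdot\mathbf{p})$. But when
j = k, the coincidence of the two velocities gives a new term. In view of
Eq. (7.49) and Prob. 7-6 we get an additional quantity

$$-\frac{1}{2}\left(\frac{e}{\hbar c}\right)^2\sum_k\left(-\frac{\hbar\epsilon}{im}\right)\mathbf{A}(\mathbf{x}_k,t_k)\cdot\mathbf{A}(\mathbf{x}_k,t_k) \tag{7.110}$$

which is equivalent to
$-i(e^2/2\hbar mc^2)\int[\mathbf{A}(\mathbf{x},t)\cdot\mathbf{A}(\mathbf{x},t)]\,dt$
and has the same effect as the first-order action of a potential
$(e^2/2mc^2)\,\mathbf{A}\cdot\mathbf{A}$.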

Thus the perturbation expansion for the action of a vector potential
has the same form as Eq. (6.17). The potential V is replaced by the
operator
$-(e/2mc)(\mathbf{p}\cdot\mathbf{A}+\mathbf{A}\cdot\mathbf{p}) + (e^2/2mc^2)\,\mathbf{A}\cdot\mathbf{A}$.
We have shown it to second order in A, but a little consideration shows it
is true to any order.

The hamiltonian for a particle in a vector potential A is

$$H = \frac{1}{2m}\left(\mathbf{p}-\frac{e}{c}\mathbf{A}\right)\cdot\left(\mathbf{p}-\frac{e}{c}\mathbf{A}\right) \tag{7.111}$$

It differs from that of a free particle, namely
$(1/2m)\,\mathbf{p}\cdot\mathbf{p}$, by just this operator
$-(e/2mc)(\mathbf{p}\cdot\mathbf{A}+\mathbf{A}\cdot\mathbf{p}) + (e^2/2mc^2)\,\mathbf{A}\cdot\mathbf{A}$.
This is a much easier way to arrive at the result we have just obtained.


THE HAMILTONIAN 


Using what we have so far derived, it would be very easy to write 
down the transition amplitude for the hamiltonian. We take the tran- 
sition amplitude for the square of the momentum, divide it by 2m, and 
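add the transition amplitude for the potential. In this way the
hamiltonian itself at the time $t_k$ could be written as

$$H_k = \frac{m}{2}\left(\frac{x_{k+1}-x_k}{\epsilon}\right)\left(\frac{x_k-x_{k-1}}{\epsilon}\right) + V(x_k) \tag{7.112}$$

while in operator form we have the transition element of the hamiltonian
as

$$\langle\chi|H|\psi\rangle = \int\chi^*\left[-\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} + V(x)\,\psi\right]dx = \int\chi^*H\psi\,dx \tag{7.113}$$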


Although this method for defining the transition amplitude for the
hamiltonian gives a perfectly correct result, it is somewhat artificial,
since it does not exhibit the important relationship between the
hamiltonian and time. Therefore, we shall next consider an alternative
definition of this transition element based upon an investigation of the
changes made in a state when it is displaced in time. This approach will
also enable us to define $H_k$ given only the form of S, no matter how
complicated.

To carry out this investigation, we break up the time axis into
infinitesimal intervals, just as we did in defining path integrals. Now,
however, it is important to point out that the subdivision of time into
equal intervals is not necessary. Any subdivision into instants $t_i$ will
be satisfactory; the process of taking limits is characterized by having
the largest spacing $t_{i+1}-t_i$ approach zero.

For simplicity, our system will consist of a single particle moving in
one dimension. The action is represented by the sum

$$S = \sum_i S[x_{i+1},t_{i+1};x_i,t_i] \tag{7.114}$$

where

$$S[x_{i+1},t_{i+1};x_i,t_i] = \int_{t_i}^{t_{i+1}}L(\dot{x}(t),x(t))\,dt \tag{7.115}$$

The integral in this expression is taken along the classical path between
$x_i$ at $t_i$ and $x_{i+1}$ at $t_{i+1}$. For our one-dimensional example
we can write, with sufficient accuracy,

$$S[x_{i+1},t_{i+1};x_i,t_i] = \left[\frac{m}{2}\left(\frac{x_{i+1}-x_i}{t_{i+1}-t_i}\right)^2 - V(x_i)\right](t_{i+1}-t_i) \tag{7.116}$$





The normalizing factor associated with an integral over $x_i$ at the time
$t_i$ is the same one we have used before, namely

$$A = \left(\frac{2\pi i\hbar\,(t_{i+1}-t_i)}{m}\right)^{1/2} \tag{7.117}$$

The relation of H to the change in a state with displacement in time
can now be studied. Consider a state $\psi(t)$ specified within a
space-time region R. Now imagine that at the same time t we consider
another state $\psi_\delta(t)$, specified within another region $R_\delta$.
Suppose the region $R_\delta$ is exactly the same as R except that it is
earlier by a time $\delta$, that is, displaced bodily towards the past by a
time $\delta$. All the apparatus required to prepare the system for
$R_\delta$ is identical to that for R but is operated a time $\delta$
sooner. If the lagrangian L depends explicitly on time, it too must be
displaced; i.e., the state $\psi_\delta$ is obtained from the L used for the
state $\psi$ except that in writing $L_\delta$ we use the time variable
$t+\delta$.

Now we ask: How does the state $\psi_\delta$ differ from $\psi$? In any
measurement the chance of finding the system in some fixed region R' is
different depending on whether the region of origin was R or $R_\delta$.
Consider the change in transition amplitude
$\langle\chi|1|\psi_\delta\rangle$ produced by the shift in time $\delta$.
We can consider this shift as effected by decreasing all values of $t_i$ by
$\delta$ for $i \le k$ and leaving all $t_i$ fixed for $i > k$.

If the reader looks ahead at this point, it may appear to him that we
are headed for trouble. Clearly, it is our intention eventually to take a
limit as all infinitesimal time intervals are decreased to zero. However,
with the present setup, at least one time interval $t_{k+1}-t_k$ has a
lower bound, so that it cannot be indefinitely decreased. This difficulty
could be straightened out by assuming the time shift $\delta$ to be itself
a function of time. We can imagine that it is turned on smoothly before
$t = t_k$ and turned off smoothly after $t = t_k$. Then keeping the time
variation of $\delta$ fixed, we can let all time intervals proceed smoothly
to zero, including $t_{k+1}-t_k$. We would then investigate the first-order
effect of the time shift by letting the magnitude of $\delta$ approach
zero. The result obtained by this more rigorous process is essentially the
same as that of the procedure we are using in our present example.

Returning now to our investigation of the effect of the time shift we
see that the action $S[x_{i+1},t_{i+1};x_i,t_i]$ as defined by Eq. (7.115)
will not change so long as both $t_{i+1}$ and $t_i$ change by the same
amount. On the other hand, $S[x_{k+1},t_{k+1};x_k,t_k]$ changes to
$S[x_{k+1},t_{k+1};x_k,t_k-\delta]$. Furthermore, the factor A associated
with the integration over $x_k$ is also altered and becomes

$$A' = \left(\frac{2\pi i\hbar\,(t_{k+1}-t_k+\delta)}{m}\right)^{1/2} \tag{7.118}$$

We use Eq. (7.2) to define the transition amplitude. Keeping in mind
that the path integral depends on both the action S and the normalizing
factor A (both of which are altered by our time shift) we can write the
change in the transition amplitude to first order in $\delta$ as

$$\langle\chi|1|\psi\rangle - \langle\chi|1|\psi_\delta\rangle = \frac{i\delta}{\hbar}\left\langle\chi\left|\frac{\partial S[x_{k+1},t_{k+1};x_k,t_k]}{\partial t_k}\right|\psi\right\rangle + \frac{\delta}{2(t_{k+1}-t_k)}\,\langle\chi|1|\psi\rangle \tag{7.119}$$

the second term coming from the change in A. We wish to define the
functional corresponding to the hamiltonian in quantum mechanics as

$$H_k = \frac{\partial S[x_{k+1},t_{k+1};x_k,t_k]}{\partial t_k} + \frac{\hbar}{2i(t_{k+1}-t_k)} \tag{7.120}$$

The first term on the right-hand side of this last equation is the
definition of the classical hamiltonian. The second term is necessary in
the quantum-mechanical definition in order to keep $H_k$ finite as the time
interval $t_{k+1}-t_k$ goes to zero. This last term is a consequence of the
change in the normalizing factor A due to the time shift $\delta$.

Applying this result to the specific one-dimensional example indicated
by Eq. (7.116), we can write the operator $H_k$ as

$$H_k = \frac{m}{2}\left(\frac{x_{k+1}-x_k}{t_{k+1}-t_k}\right)^2 + \frac{\hbar}{2i(t_{k+1}-t_k)} + V(x_k)$$

$$\phantom{H_k} = \frac{m}{2}\left(\frac{x_{k+1}-x_k}{t_{k+1}-t_k}\right)\left(\frac{x_k-x_{k-1}}{t_k-t_{k-1}}\right) + V(x_k) \tag{7.121}$$

The second of these equations is based upon the results obtained in
Eq. (7.54). By writing the product of velocities as the product of two
successive velocities, we can do away with the apparently extraneous
term $\hbar/(2i(t_{k+1}-t_k))$.
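The first line of Eq. (7.121) can be checked directly by differentiating the short-interval action of Eq. (7.116) with respect to $t_k$. The abbreviations $\Delta x$ and $\Delta t$ below are introduced here only for brevity and are not the book's notation; the last line uses the first-order estimate of the mean square step quoted from Prob. 7-6.

```latex
\begin{align*}
S &= \frac{m}{2}\,\frac{(\Delta x)^2}{\Delta t} - V(x_k)\,\Delta t,
  \qquad \Delta x = x_{k+1}-x_k,\quad \Delta t = t_{k+1}-t_k \\[4pt]
\frac{\partial S}{\partial t_k} &= -\frac{\partial S}{\partial(\Delta t)}
  = \frac{m}{2}\left(\frac{\Delta x}{\Delta t}\right)^2 + V(x_k) \\[4pt]
(\Delta x)^2 \sim -\frac{\hbar\,\Delta t}{im}
  \;&\Longrightarrow\;
  \frac{m}{2}\left(\frac{\Delta x}{\Delta t}\right)^2 \sim -\frac{\hbar}{2i\,\Delta t},
  \quad\text{which cancels the added term } +\frac{\hbar}{2i\,\Delta t}.
\end{align*}
```

So the derivative reproduces the classical energy, and the extra term in Eq. (7.120) removes exactly the part of the kinetic term that diverges as the interval shrinks.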
Using the relation $t_\delta = t-\delta$ for all values of $t < t_k$, we
have

$$\psi_\delta(t) = \psi(t+\delta) = \psi(t) + \delta\,\frac{\partial\psi}{\partial t} \tag{7.122}$$

connecting the function $\psi$ defined in the two regions R and
$R_\delta$. Thus the cycle of relations connecting operators to the
Schrödinger equation and to path integrals can be closed with the result
obtained by combining Eqs. (7.119), (7.120), and (7.122):

$$-\delta\left\langle\chi\left|1\right|\frac{\partial\psi}{\partial t}\right\rangle = \frac{i\delta}{\hbar}\,\langle\chi|H_k|\psi\rangle \tag{7.123}$$

which leads us back again to the Schrödinger equation

$$-\frac{\hbar}{i}\frac{\partial\psi}{\partial t} = H\psi$$

For arbitrarily complicated actions we can find an expression for the
hamiltonian (i.e., a functional corresponding to the energy) by asking for
the first-order change in the transition amplitude
$\langle\chi|1|\psi\rangle$ when all times previous to t are shifted by
$-\delta$ and writing this change as $\langle\chi|H(t)|\psi\rangle\,\delta$.


Harmonic Oscillators 


THE problem of the harmonic oscillator is perhaps the simplest in quan- 
tum mechanics. We solved it completely in Prob. 3-8 when we found 
that the kernel for the motion of a harmonic oscillator is 


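$$K(x_b,T;x_a,0) = \left(\frac{m\omega}{2\pi i\hbar\sin\omega T}\right)^{1/2}\exp\left\{\frac{im\omega}{2\hbar\sin\omega T}\left[(x_b^2+x_a^2)\cos\omega T - 2x_bx_a\right]\right\} \tag{8.1}$$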

If we are to make full use of this, we should look at all sorts of problems 
which involve harmonic oscillators, either exactly or approximately. It 
is the purpose of this chapter to describe several such problems, both 
those involving single oscillators and those involving systems of interact- 
ing harmonic oscillators. We could carry this program to extremes and 
include all kinds of classical vibration problems (plates, rods, etc.), but 
such systems are so large that it would be a waste of time to analyze the 
quantum-mechanical corrections. Instead, it would be better to look at 
systems on the atomic scale. For example, we might analyze the oscilla- 
tions of the molecule CO. In so doing, we find that the potential energy 
between the carbon and oxygen atoms is not exactly quadratic. Never- 
theless, for the lower-energy states the potential is so close to quadratic 
that a pure harmonic oscillator treatment is a good approximation for 
many purposes. 

In a much more complicated polyatomic molecule, when the excita- 
tion energy is not too high, the travel of the atoms is small compared 
with their spacing. In this case again the potential energy is very nearly a 
quadratic function of the coordinates. Thus the system is approximately 
equivalent to a set of coupled harmonic oscillators. A solid crystal is, 
from one point of view, a polyatomic molecule of great size. As such it 
is a vast array of interacting harmonic oscillators. 

As another example we can consider the electromagnetic field in 
a cavity. Classically, there are several patterns of standing waves, or 
modes, in which the field can vibrate harmonically with a definite fre- 
quency. In quantum mechanics, each of these modes constitutes a quan- 
tum oscillator. 


THE SIMPLE HARMONIC OSCILLATOR 


Solution from the Schrödinger Equation. In this section we
shall develop a number of relations describing the simple one-dimensional
harmonic oscillator. We shall begin with the language of the Schrödinger
equation. Problem 2-2 gave the lagrangian describing a one-dimensional
harmonic oscillator as

$$L = \frac{m}{2}\left(\dot{x}^2 - \omega^2x^2\right) \tag{8.2}$$

The corresponding hamiltonian, which we use in the present treatment,
is

$$H = \frac{p^2}{2m} + \frac{m\omega^2}{2}\,x^2 \tag{8.3}$$

The wave equation is then

$$-\frac{\hbar}{i}\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} + \frac{m\omega^2}{2}\,x^2\psi \tag{8.4}$$

Since the hamiltonian is independent of time, the wave equation is easily
separated, and it yields wave functions of steady states of definite energy
$E_n$. The time-dependent part is proportional to $e^{-(i/\hbar)E_nt}$.

Recalling that the momentum operator p corresponds to differentiation
with respect to x (see Sec. 7-5), we can write the Schrödinger
equation for the spatial part of the wave function as

$$H\phi_n(x) = -\frac{\hbar^2}{2m}\frac{d^2\phi_n(x)}{dx^2} + \frac{m\omega^2}{2}\,x^2\phi_n(x) = E_n\phi_n(x) \tag{8.5}$$

This equation is easily solved. The result is given in many books on
quantum mechanics.¹ The eigenvalues for the energy are

$$E_n = \hbar\omega\left(n+\tfrac{1}{2}\right) \tag{8.6}$$


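where n is an integer: 0, 1, 2,.... The eigenfunctions $\phi_n(x)$ are

$$\phi_n(x) = \frac{1}{(2^n\,n!)^{1/2}}\left(\frac{m\omega}{\pi\hbar}\right)^{1/4}H_n\!\left(x\sqrt{\frac{m\omega}{\hbar}}\right)e^{-(m\omega/2\hbar)x^2} \tag{8.7}$$

where the functions $H_n$ are the Hermite polynomials

$$H_0(y) = 1 \qquad H_1(y) = 2y \qquad H_2(y) = 4y^2-2 \qquad \ldots \qquad H_n(y) = (-1)^n e^{y^2}\frac{d^n}{dy^n}e^{-y^2} \tag{8.8}$$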
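A short numerical check of Eqs. (8.5) to (8.8) may be useful here. The sketch below assumes units with ħ = m = ω = 1 and builds the Hermite polynomials from the standard three-term recurrence $H_{k+1} = 2yH_k - 2kH_{k-1}$, which is not stated in the text but follows from the definition in Eq. (8.8). It then tests that the functions of Eq. (8.7) are orthonormal and that the expectation of the hamiltonian of Eq. (8.5) is close to $n + \tfrac12$. The grid limits and step are arbitrary choices.

```python
import math

HBAR = MASS = OMEGA = 1.0  # assumption: natural units


def hermite(n, y):
    # Recurrence H_{k+1} = 2 y H_k - 2 k H_{k-1}, started from H_0 = 1 and H_1 = 2y.
    h_prev, h = 1.0, 2.0 * y
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * y * h - 2.0 * k * h_prev
    return h


def phi(n, x):
    # Eigenfunction of Eq. (8.7).
    alpha = MASS * OMEGA / HBAR
    norm = (alpha / math.pi) ** 0.25 / math.sqrt(2.0 ** n * math.factorial(n))
    return norm * hermite(n, math.sqrt(alpha) * x) * math.exp(-0.5 * alpha * x * x)


def overlap(m, n, lo=-10.0, hi=10.0, steps=4000):
    # Integral of phi_m * phi_n over x as a Riemann sum.
    dx = (hi - lo) / steps
    return sum(phi(m, lo + k * dx) * phi(n, lo + k * dx) for k in range(steps + 1)) * dx


def energy(n, lo=-10.0, hi=10.0, steps=4000):
    # <phi_n| H |phi_n> with H from Eq. (8.5); phi'' by a central second difference.
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * dx
        second = (phi(n, x + dx) - 2.0 * phi(n, x) + phi(n, x - dx)) / (dx * dx)
        h_phi = -(HBAR ** 2) / (2.0 * MASS) * second + 0.5 * MASS * OMEGA ** 2 * x * x * phi(n, x)
        total += phi(n, x) * h_phi
    return total * dx


for n in range(4):
    print(n, overlap(n, n), energy(n))
```

The printed energies should approach 0.5, 1.5, 2.5, 3.5, in agreement with Eq. (8.6), up to the finite-difference error of the second derivative.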


¹ L.I. Schiff, “Quantum Mechanics,” 2nd ed., McGraw-Hill Book Company, New
York, 1955.

The Hermite polynomials are best defined by their generating function

$$e^{-t^2+2ty} = \sum_{n=0}^{\infty}H_n(y)\,\frac{t^n}{n!} \tag{8.9}$$
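The generating function of Eq. (8.9) can be used directly to produce the polynomials: multiplying the power series of $e^{2ty}$ and $e^{-t^2}$ and collecting the coefficient of $t^n$ gives $H_n(y)/n!$. The sketch below does this with plain Python and compares the partial sum of the series with the closed form. The function names are hypothetical, and the sample values of t and y are arbitrary.

```python
import math


def hermite_from_generating_function(n, y):
    # Coefficient of t^n in exp(2 t y) * exp(-t^2), multiplied by n!.
    total = 0.0
    for k in range(n // 2 + 1):
        total += ((2.0 * y) ** (n - 2 * k) / math.factorial(n - 2 * k)
                  * (-1.0) ** k / math.factorial(k))
    return math.factorial(n) * total


def generating_function_partial_sum(t, y, terms=40):
    # Right-hand side of Eq. (8.9), truncated after `terms` terms.
    return sum(hermite_from_generating_function(n, y) * t ** n / math.factorial(n)
               for n in range(terms))


T_SAMPLE, Y_SAMPLE = 0.3, 0.7
SERIES = generating_function_partial_sum(T_SAMPLE, Y_SAMPLE)
CLOSED = math.exp(-T_SAMPLE ** 2 + 2.0 * T_SAMPLE * Y_SAMPLE)
print(SERIES, CLOSED)
```

For n = 0, 1, 2 the coefficients reduce to 1, 2y and 4y² − 2, the entries listed in Eq. (8.8).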


We can obtain these results in another manner. The functions
$\phi_n(x)$ have been obtained by solving a differential equation, namely,
the time-independent case. However, we already have a solution for the
time-dependent case. From this solution we should be able to derive these
functions directly. It is instructive to carry out this derivation to
illustrate some of the formulas which have been derived in earlier
chapters.


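Solution from the Kernel. We have worked out the kernel
describing the motion of an oscillator in Prob. 3-8. And we know from
Eq. (4.59) that this kernel can be expanded in terms of energy
eigenfunctions. That is

$$\left(\frac{m\omega}{2\pi i\hbar\sin\omega T}\right)^{1/2}\exp\left\{\frac{im\omega}{2\hbar\sin\omega T}\left[(x_a^2+x_b^2)\cos\omega T - 2x_ax_b\right]\right\} = \sum_{n=0}^{\infty}e^{-(i/\hbar)E_nT}\,\phi_n(x_b)\,\phi_n^*(x_a) \tag{8.10}$$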
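Equation (8.10) can be tested numerically, with one caveat: for real T the sum on the right does not converge absolutely, so the sketch below continues both sides to imaginary time, T = −iτ with τ > 0. That continuation is an assumption made here for the sake of a convergent check and is not used in this section of the text. With it, sin ωT and cos ωT become −i sinh ωτ and cosh ωτ, and the phase factors become decaying exponentials. Units with ħ = m = ω = 1 are also assumed, and the eigenfunctions are generated by the normalized form of the Hermite recurrence.

```python
import math

HBAR = MASS = OMEGA = 1.0  # assumption: natural units


def eigenfunctions(x, n_max):
    # phi_0 ... phi_{n_max} at x, from the normalized recurrence
    # phi_{n+1} = sqrt(2/(n+1)) y phi_n - sqrt(n/(n+1)) phi_{n-1}, with y = x sqrt(m w / hbar).
    alpha = MASS * OMEGA / HBAR
    y = math.sqrt(alpha) * x
    phis = [(alpha / math.pi) ** 0.25 * math.exp(-0.5 * y * y)]
    if n_max >= 1:
        phis.append(math.sqrt(2.0) * y * phis[0])
    for n in range(1, n_max):
        phis.append(math.sqrt(2.0 / (n + 1)) * y * phis[n]
                    - math.sqrt(n / (n + 1)) * phis[n - 1])
    return phis


def kernel_imaginary_time(xb, xa, tau):
    # Left-hand side of Eq. (8.10) with T = -i tau.
    alpha = MASS * OMEGA / HBAR
    s = math.sinh(OMEGA * tau)
    c = math.cosh(OMEGA * tau)
    prefactor = math.sqrt(alpha / (2.0 * math.pi * s))
    return prefactor * math.exp(-alpha / (2.0 * s) * ((xb * xb + xa * xa) * c - 2.0 * xb * xa))


def eigen_sum(xb, xa, tau, n_max=40):
    # Right-hand side of Eq. (8.10) with T = -i tau and E_n = hbar w (n + 1/2).
    phis_b = eigenfunctions(xb, n_max)
    phis_a = eigenfunctions(xa, n_max)
    return sum(math.exp(-(n + 0.5) * OMEGA * tau) * phis_b[n] * phis_a[n]
               for n in range(n_max + 1))


print(kernel_imaginary_time(0.5, -0.3, 1.0), eigen_sum(0.5, -0.3, 1.0))
```

The two numbers should agree closely once enough terms are kept; at τ = 1 the terms fall off as $e^{-n}$, so forty terms are far more than needed.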


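Using the relations

$$i\sin\omega T = \tfrac{1}{2}\,e^{i\omega T}\left(1-e^{-2i\omega T}\right) \qquad \cos\omega T = \tfrac{1}{2}\,e^{i\omega T}\left(1+e^{-2i\omega T}\right) \tag{8.11}$$

we can write the left-hand side of Eq. (8.10) as

$$\left(\frac{m\omega}{\pi\hbar}\right)^{1/2}e^{-i\omega T/2}\left(1-e^{-2i\omega T}\right)^{-1/2}\exp\left\{-\frac{m\omega}{2\hbar}\left[(x_a^2+x_b^2)\,\frac{1+e^{-2i\omega T}}{1-e^{-2i\omega T}} - \frac{4x_ax_b\,e^{-i\omega T}}{1-e^{-2i\omega T}}\right]\right\} \tag{8.12}$$

We can obtain a series with the form of the right-hand side of Eq. (8.10)
by expanding Eq. (8.12) in powers of $e^{-i\omega T}$. Because of the
initial factor $e^{-i\omega T/2}$ the terms in the expansion will be of the
form $e^{-i\omega T/2}e^{-in\omega T}$ for n = 0,1,2,.... This means the
energy levels are given by

$$E_n = \hbar\omega\left(n+\tfrac{1}{2}\right) \tag{8.13}$$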


To find the wave functions, we must carry out the expansion
completely. We shall illustrate the method by going only as far as n = 2.
Expanding the left-hand side of Eq. (8.10) to this order we have

From this we can pick out the coefficient of the lowest term. It is

$$\left(\frac{m\omega}{\pi\hbar}\right)^{1/2}e^{-i\omega T/2}\,e^{-(m\omega/2\hbar)(x_a^2+x_b^2)} = e^{-(i/\hbar)E_0T}\,\phi_0(x_b)\,\phi_0^*(x_a) \tag{8.16}$$

This means that $E_0 = \tfrac{1}{2}\hbar\omega$ and

$$\phi_0(x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4}e^{-(m\omega/2\hbar)x^2} \tag{8.17}$$

We have chosen $\phi_0(x)$ to be real. We could make it complex by
including a factor $e^{i\delta}$, where $\delta$ is a real constant;
however, it would make no difference to any physical result.
The next-order term in the expansion is

The next term corresponds to $E_2 = \tfrac{5}{2}\hbar\omega$. The part of
the term depending on $x_b$ and $x_a$ is

This must be the same as $\phi_2(x_b)\,\phi_2^*(x_a)$. Since the expression
in the brackets can be rewritten as

$$\frac{1}{2}\left(\frac{2m\omega}{\hbar}\,x_b^2-1\right)\left(\frac{2m\omega}{\hbar}\,x_a^2-1\right)$$

we find

$$\phi_2(x) = \frac{1}{\sqrt{2}}\left(\frac{2m\omega}{\hbar}\,x^2-1\right)\phi_0(x) \tag{8.22}$$


These results are the same as those obtained from the solution of the 
energy wave equation, Eqs. (8.7) and (8.8). 
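The expansion that leads to these three wave functions can be written out compactly. The following is a worked sketch derived here from Eq. (8.12), not a transcription of the book's intermediate equations; the shorthand $z$ and $\alpha$ is introduced only for this block.

```latex
\begin{align*}
z &\equiv e^{-i\omega T}, \qquad \alpha \equiv \frac{m\omega}{\hbar} \\[4pt]
(1-z^2)^{-1/2} &= 1 + \tfrac{1}{2}z^2 + O(z^4), \qquad
\frac{1+z^2}{1-z^2} = 1 + 2z^2 + O(z^4), \qquad
\frac{z}{1-z^2} = z + O(z^3) \\[4pt]
K &= \left(\frac{\alpha}{\pi}\right)^{1/2} z^{1/2}\,
     e^{-(\alpha/2)(x_a^2+x_b^2)}
     \Big[\,1 + 2\alpha\,x_a x_b\,z
     + \tfrac{1}{2}\,(2\alpha x_a^2-1)(2\alpha x_b^2-1)\,z^2 + O(z^3)\Big] \\[4pt]
z^{1/2}:&\quad E_0 = \tfrac{1}{2}\hbar\omega, \quad
  \phi_0(x) = \left(\frac{\alpha}{\pi}\right)^{1/4} e^{-\alpha x^2/2} \\
z^{3/2}:&\quad E_1 = \tfrac{3}{2}\hbar\omega, \quad
  \phi_1(x) = \sqrt{2\alpha}\;x\,\phi_0(x) \\
z^{5/2}:&\quad E_2 = \tfrac{5}{2}\hbar\omega, \quad
  \phi_2(x) = \tfrac{1}{\sqrt{2}}\,(2\alpha x^2-1)\,\phi_0(x)
\end{align*}
```

Each coefficient factors into a function of $x_b$ times the same function of $x_a$, which is what allows the wave functions to be read off; the results agree with Eq. (8.7) for $H_1(y) = 2y$ and $H_2(y) = 4y^2 - 2$.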

All of the wave functions may be obtained in this manner. However,
it is a difficult algebraic problem to get the general form for
$\phi_n(x)$ directly from this expansion. A less direct way is illustrated
in the problem.


Problem 8-1 The amplitude to go from any state $\psi(x)$ to another
state $\chi(x)$ is the transition amplitude $\langle\chi|1|\psi\rangle$ as
defined in Eq. (7.1).

Suppose $\psi(x)$ and $\chi(x)$ are expanded in terms of the orthogonal
functions $\phi_n(x)$, the energy solutions to the wave equation associated
with the kernel K(b,a), as discussed in Sec. 4-2. Thus

$$\psi(x) = \sum_n\psi_n\,\phi_n(x) \qquad \chi(x) = \sum_n\chi_n\,\phi_n(x) \tag{8.23}$$

Using the coefficients $\psi_n$ and $\chi_n$ and Eq. (4.59), show that the
transition amplitude can be written as

$$\iint\chi^*(x_b)\,K(x_b,T;x_a,0)\,\psi(x_a)\,dx_a\,dx_b = \sum_n\chi_n^*\,\psi_n\,e^{-(i/\hbar)E_nT} \tag{8.24}$$


Next, suppose we choose a special pair of functions $\psi(x)$ and
$\chi(x)$ for which the expansion on the right-hand side of Eq. (8.24) is
simple. Then after obtaining the functions $\psi_n$ we could get some
information about the wave functions $\phi_n(x)$ from the expansions of
Eq. (8.23). Suppose we choose the functions $\psi(x)$ and $\chi(x)$ in the
following way

$$\psi(x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4}e^{-(m\omega/2\hbar)(x-a)^2} \tag{8.25}$$

$$\chi(x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4}e^{-(m\omega/2\hbar)(x-b)^2} \tag{8.26}$$

These functions represent gaussian distributions centered about a and
b respectively. We shall call $\psi_n = \psi_n(a)$ and
$\chi_n = \psi_n(b)$. Determine the transition amplitude
$\langle\chi|1|\psi\rangle$, where $\psi(x)$ and $\chi(x)$ are given by
Eqs. (8.25) and (8.26), and the kernel is that for a harmonic oscillator,
Eq. (8.1). Perform the integrals in Eq. (8.24) to get

$$\langle\chi|1|\psi\rangle = \exp\left\{-\frac{i\omega T}{2} - \frac{m\omega}{4\hbar}\left(a^2+b^2-2ab\,e^{-i\omega T}\right)\right\} \tag{8.27}$$

From this result show that $E_n = \hbar\omega(n+\tfrac{1}{2})$ and that

$$\psi_n(a) = \left(\frac{m\omega}{2\hbar}\right)^{n/2}\frac{a^n}{\sqrt{n!}}\;e^{-m\omega a^2/4\hbar} \tag{8.28}$$

Use this result in Eq. (8.23) and write for $\phi_n(x)$ the form given by
Eq. (8.7) considering the $H_n(y)$ still unknown. From this derive the
generating function of Eq. (8.9) for these functions $H_n(y)$.





THE POLYATOMIC MOLECULE 


In the preceding section we derived the wave functions and energy levels 
which describe the simple harmonic oscillator. In this section we begin 
our investigation of systems of interacting oscillators with the study of 
polyatomic molecules. We begin the analysis by assigning coordinates 
describing the position of each atom in the molecule. The position of
any particular atom a will be given by the three cartesian coordinates
$x_a$, $y_a$, and $z_a$, whose origin lies at the equilibrium position for
the atom. If the mass of the atom is $m_a$, the kinetic energy of the whole
molecule is given by

$$\sum_a\tfrac{1}{2}\,m_a\left(\dot{x}_a^2+\dot{y}_a^2+\dot{z}_a^2\right) \tag{8.29}$$


where the summation is carried out over all atoms in the molecule. 

It will be more convenient for this general discussion to avoid the 
vector aspects of this description by making the following modification. 
We suppose there are N atoms in the molecule. We shall define n = 3N 
coordinates in the following way: 


$$q_1 = \sqrt{m_a}\,x_a \qquad q_2 = \sqrt{m_a}\,y_a \qquad q_3 = \sqrt{m_a}\,z_a \qquad q_4 = \sqrt{m_b}\,x_b \qquad q_5 = \sqrt{m_b}\,y_b \qquad \cdots \tag{8.30}$$

In terms of these new coordinates the kinetic energy is

$$\text{K.E.} = \frac{1}{2}\sum_{j=1}^{n}\dot{q}_j^2 \tag{8.31}$$

The potential energy is the function $V(q_1,q_2,\ldots,q_n)$ of all the
displacements $q_j$. We can expand V in a Taylor series around the
equilibrium position $q_j = 0$. Thus

$$V(q_1,q_2,\ldots,q_n) = V(0,0,\ldots,0) + \sum_{j=1}^{n}q_j\,V_j(0,0,\ldots,0) + \frac{1}{2}\sum_{j=1}^{n}\sum_{k=1}^{n}q_jq_k\,V_{jk}(0,0,\ldots,0) + \cdots \tag{8.32}$$

The first term is the potential energy at equilibrium. It is a constant
independent of the $q_j$. We shall assign it the value zero by shifting
the zero-level of potential energy. The second term contains the factor
$V_j(0,0,\ldots,0)$, which is the potential gradient or force associated
with the coordinate $q_j$ and evaluated at equilibrium position. This
factor is therefore zero. To put this another way, since equilibrium
corresponds to a minimum of potential energy, the first-order change for
displacements about equilibrium must vanish.

The factors $V_{jk}(0,0,\ldots,0)$ appearing in the third term comprise a
set of constants whose values depend on the structure of the molecule. Call
these constants $v_{jk}$. Now suppose we neglect all higher-order terms.
In this approximation the potential energy involves each coordinate
quadratically. Even if the potential is not a pure quadratic function
of the coordinates, our approximation will be valid for small
displacements. It is by this approximation that we represent our molecule
as a system of harmonic oscillators.

Combining Eqs. (8.31) and (8.32), we can write the lagrangian as

$$L = \frac{1}{2}\sum_{j=1}^{n}\dot{q}_j^2 - \frac{1}{2}\sum_{j=1}^{n}\sum_{k=1}^{n}v_{jk}\,q_jq_k \tag{8.34}$$

Next, we introduce this lagrangian into the path integral which defines
the kernel describing the motion of the atoms in the molecule,

$$K = \int\!\!\int\!\cdots\!\int\exp\left\{\frac{i}{\hbar}\int\left[\frac{1}{2}\sum_{j=1}^{n}\dot{q}_j^2 - \frac{1}{2}\sum_{j=1}^{n}\sum_{k=1}^{n}v_{jk}\,q_jq_k\right]dt\right\}\mathcal{D}q_1(t)\,\mathcal{D}q_2(t)\cdots\mathcal{D}q_n(t) \tag{8.35}$$

All of these path integrals are gaussian, and thus they can be solved
by the methods discussed in Sec. 3-5. To carry out that solution, we
shall have to find those paths $\bar{q}_j(t)$ which give a stationary value
for the action integral. Variation with respect to each $q_j(t)$ gives
these paths as solutions of

$$\ddot{\bar{q}}_j(t) = -\sum_{k=1}^{n}v_{jk}\,\bar{q}_k(t) \tag{8.36}$$

This last equation says that the force on any single atom in a particular
direction is some linear combination of the displacements of all the
atoms.

Such systems of interacting oscillators have been analyzed to a great 
extent from a classical point of view. Since in many problems of quan- 
tum mechanics we obtain the classical action as the first step in solving 
the kernel, all of this classical work is of great value to us. One impor- 
tant result of the classical analysis is the following. There are special 
ways to distort the molecule so that, as time goes on, the motion is of 
the simple periodic sinusoidal type. The pattern of distortions remains 
the same, and only the amount of the distortion varies sinusoidally with 
time. Different patterns of distortion, or, as we say, different modes, 
correspond in general to different frequencies. There may be some with 
zero frequency, and some groups of modes may all have the same fre- 
quency. The important fact is this: Any small displacement motion of 
the molecule can be built up as a linear combination of such modes. 
This kind of motion is called a normal mode. 

If there are N atoms in the molecule, then the molecule has n = 3N
modes of motion. Thus, for example, the molecule CO₂ has nine modes,
as shown by Fig. 8-1, where the motion of each atom is indicated by an
arrow. Only modes 1 to 4 are periodic (i.e., have a non-zero frequency)
and the direction of motion during the first half-cycle is indicated. For
the second half-cycle, reverse all arrows.

Fig. 8-1 Normal modes of the CO₂ molecule. The symbol ⊙ means motion
out of the plane of the paper, and ⊗ means motion into the plane. Modes 1
to 4 are periodic; modes 5 to 7 are continuous translations; and modes 8 and
9 are continuous rotations.

We shall next derive the mathematical description of the modes.
This derivation is, of course, part of classical physics rather than
quantum mechanics. Consider a particular mode of frequency $\omega$. All of
the coordinates $q_j(t)$ move together and at the same frequency. There
must be some special set of initial displacements $a_j$, different for each
mode, such that if all initial velocities are zero, the subsequent motion
of any coordinate can be written as

$$q_j(t) = a_j\cos\omega t \tag{8.37}$$


Substituting this equation into Eq. (8.36) gives 


n 


Waj = N Ujka (8.38) 
k=l 


This last formula is actually a set of n equations for the n unknowns
$a_j$. Since it is homogeneous, it has a nontrivial solution only if the
determinant of coefficients vanishes. Thus we require

$$\det\left| v_{jk} - \omega^2 \delta_{jk} \right| = 0 \tag{8.39}$$

This equation has n solutions for $\omega^2$. For a particular solution, say
$\omega_\alpha$, we can get solutions for the set of equations (8.38). We shall
call these $a_{j\alpha}$. The sizes of the solutions $a_{j\alpha}$ are determined
relative to each other, but the overall magnitude of the whole set is
arbitrary. We shall choose this magnitude so that

$$\sum_{j=1}^{n} a_{j\alpha}^2 = 1 \tag{8.40}$$

We can repeat this process for all of the n modes, $\alpha = 1, 2, \ldots, n$. We
determine n values of $\omega_\alpha$, and for each value of $\alpha$ we obtain the
solutions for the n constants $a_{j\alpha}$.

Any possible motion of the system is a linear combination of these
modes. We can write an expression for a general type of motion as

$$q_j(t) = \sum_{\alpha=1}^{n} C_\alpha\, a_{j\alpha} \cos(\omega_\alpha t + \delta_\alpha) \tag{8.41}$$

Here the constant of amplitude $C_\alpha$ and the constant phase $\delta_\alpha$ depend
on the initial conditions. That such an expression does represent the
motion of the system is easily verified by substituting Eq. (8.41) into
Eq. (8.36).
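
It is convenient to use the complex notation for Eq. (8.41). That is,

$$q_j(t) = \mathrm{Re}\Big\{\sum_{\alpha=1}^{n} C_\alpha a_{j\alpha}\, e^{i(\omega_\alpha t + \delta_\alpha)}\Big\} = \mathrm{Re}\Big\{\sum_{\alpha=1}^{n} c_\alpha a_{j\alpha}\, e^{i\omega_\alpha t}\Big\} \tag{8.42}$$

The complex constants $c_\alpha = C_\alpha e^{i\delta_\alpha}$ depend on the initial conditions,
and they can be determined as follows. If the initial positions and
velocities are $q_j(0)$ and $\dot q_j(0)$, respectively, we have

$$\begin{aligned}
q_j(0) &= \mathrm{Re}\Big\{\sum_{\alpha=1}^{n} c_\alpha a_{j\alpha}\Big\} = \sum_{\alpha=1}^{n} \mathrm{Re}\{c_\alpha\}\, a_{j\alpha} \\
\dot q_j(0) &= \mathrm{Re}\Big\{\sum_{\alpha=1}^{n} i\omega_\alpha c_\alpha a_{j\alpha}\Big\} = -\sum_{\alpha=1}^{n} \omega_\alpha\, \mathrm{Im}\{c_\alpha\}\, a_{j\alpha}
\end{aligned} \tag{8.43}$$

Since the constants $a_{j\alpha}$ are all real, this pair of equations determines
both the real and imaginary parts of $c_\alpha$.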

We can solve Eqs. (8.43) in a simple way by using an important
property to be expressed in Eq. (8.48), which we now derive. For any
particular $\alpha$ the constants $a_{j\alpha}$ satisfy

$$\omega_\alpha^2\, a_{j\alpha} = \sum_{k=1}^{n} v_{jk}\, a_{k\alpha} \tag{8.44}$$

If we multiply this equation by $a_{j\beta}$ and sum over all values of j, we find

$$\sum_{j=1}^{n} \sum_{k=1}^{n} v_{jk}\, a_{k\alpha} a_{j\beta} = \omega_\alpha^2 \sum_{j=1}^{n} a_{j\alpha} a_{j\beta} \tag{8.45}$$

Since the coefficients $v_{jk}$ are symmetrical, the left-hand side of Eq. (8.45)
will be the same if $\alpha$ and $\beta$ are interchanged. This means

$$(\omega_\alpha^2 - \omega_\beta^2) \sum_{j=1}^{n} a_{j\alpha} a_{j\beta} = 0 \tag{8.46}$$

Thus if the frequencies $\omega_\alpha$ and $\omega_\beta$ are different, it must be that

$$\sum_{j=1}^{n} a_{j\alpha} a_{j\beta} = 0 \tag{8.47}$$

If the two frequencies are the same, then the constants $a_{j\alpha}$ are not
determinate. Instead, we have the freedom to make an arbitrary choice
which can be made in such a way that Eq. (8.47) is satisfied for $\alpha \neq \beta$.
Thus finally, making use of the normalization established in Eq. (8.40),
we can write

$$\sum_{j=1}^{n} a_{j\alpha} a_{j\beta} = \delta_{\alpha\beta} \tag{8.48}$$

where $\delta_{\alpha\beta}$ is the Kronecker delta.
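
As a concrete check of Eqs. (8.38), (8.40), and (8.48), the following minimal sketch works out the modes of a two-coordinate system in closed form. The matrix `V` and the helper names (`symmetric_2x2_modes`, `residual`, `overlap`) are assumptions introduced here for illustration; they do not come from the text.

```python
import math

# Hypothetical coupling matrix v_jk for two unit-mass coordinates
# (an illustrative assumption, not taken from the text).
V = [[2.0, -1.0], [-1.0, 2.0]]


def symmetric_2x2_modes(v):
    """Return [(omega_squared, (a_1, a_2)), ...] for a symmetric 2x2 matrix v."""
    a, b, d = v[0][0], v[0][1], v[1][1]
    if b == 0.0:
        # Already uncoupled: each coordinate is its own mode.
        return [(a, (1.0, 0.0)), (d, (0.0, 1.0))]
    mean = 0.5 * (a + d)
    half_gap = math.hypot(0.5 * (a - d), b)
    modes = []
    for omega_sq in (mean - half_gap, mean + half_gap):
        # Row 1 of (v - omega^2 I) a = 0 is satisfied by a = (b, omega^2 - v_11).
        vec = (b, omega_sq - a)
        norm = math.hypot(vec[0], vec[1])
        modes.append((omega_sq, (vec[0] / norm, vec[1] / norm)))  # Eq. (8.40)
    return modes


def residual(v, omega_sq, a):
    """Largest |sum_k v_jk a_k - omega^2 a_j| over j: the error in Eq. (8.38)."""
    return max(
        abs(sum(v[j][k] * a[k] for k in range(2)) - omega_sq * a[j])
        for j in range(2)
    )


def overlap(a, b):
    """sum_j a_j b_j, the left-hand side of Eq. (8.48)."""
    return sum(x * y for x, y in zip(a, b))


MODES = symmetric_2x2_modes(V)
for omega_sq, a in MODES:
    print("omega^2 =", omega_sq, "a =", a, "residual =", residual(V, omega_sq, a))
print("overlap of the two modes:", overlap(MODES[0][1], MODES[1][1]))
```

For this particular matrix the two modes are the in-phase and out-of-phase motions of the pair, and their overlap vanishes because the two frequencies differ, as Eq. (8.47) requires.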

We can now easily find the real part of $c_\beta$ from Eqs. (8.43). Multiply
the first of Eqs. (8.43) by $a_{j\beta}$ and sum over values of j. All terms on
the right-hand side vanish except that for $\alpha = \beta$, which gives

$$\mathrm{Re}\{c_\beta\} = \sum_{j=1}^{n} a_{j\beta}\, q_j(0) \tag{8.49}$$

In a similar manner we can find

$$\mathrm{Im}\{c_\beta\} = -\frac{1}{\omega_\beta} \sum_{j=1}^{n} a_{j\beta}\, \dot q_j(0) \tag{8.50}$$

Thus a complete description of any arbitrary motion of the system can
be determined from a knowledge of the normal modes of the system and
the initial conditions of the motion.
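
The recipe of Eqs. (8.49), (8.50), and (8.42) can be sketched numerically. The example below assumes a hypothetical two-coordinate system with $v_{jk}$ = [[2, -1], [-1, 2]], whose mode frequencies are 1 and $\sqrt{3}$; the names `OMEGA`, `A`, `mode_constants`, and `position` are illustrative and not from the text.

```python
import cmath
import math

S = 1.0 / math.sqrt(2.0)
OMEGA = [1.0, math.sqrt(3.0)]   # omega_alpha for the assumed 2x2 example
A = [[S, S], [S, -S]]           # A[j][alpha] = a_{j alpha}, real and orthonormal


def mode_constants(q0, v0):
    """Complex constants c_alpha from initial positions q0 and velocities v0."""
    constants = []
    for alpha in range(2):
        real = sum(A[j][alpha] * q0[j] for j in range(2))                   # Eq. (8.49)
        imag = -sum(A[j][alpha] * v0[j] for j in range(2)) / OMEGA[alpha]   # Eq. (8.50)
        constants.append(complex(real, imag))
    return constants


def position(c, j, t):
    """q_j(t) = Re{ sum_alpha c_alpha a_{j alpha} exp(i omega_alpha t) }, Eq. (8.42)."""
    total = sum(c[al] * A[j][al] * cmath.exp(1j * OMEGA[al] * t) for al in range(2))
    return total.real


q0 = [1.0, 0.0]   # initial positions (arbitrary illustrative values)
v0 = [0.0, 0.5]   # initial velocities (arbitrary illustrative values)
c = mode_constants(q0, v0)
print("c =", c)
print("q(0) =", [position(c, j, 0.0) for j in range(2)])
```

Evaluating `position` at t = 0, and differencing it over a small time step, reproduces the assumed initial positions and velocities, which is the content of Eqs. (8.43).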


NORMAL COORDINATES 


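We can analyze the motion of the system in another way. Let us choose a
new set of coordinates $Q_\alpha(t)$, which are a particular linear combination
of the old coordinates, namely

$$Q_\alpha(t) = \sum_{j=1}^{n} a_{j\alpha}\, q_j(t) \tag{8.51}$$

Alternatively, the old coordinates can be given in terms of the new by

$$q_j(t) = \sum_{\alpha=1}^{n} a_{j\alpha}\, Q_\alpha(t) \tag{8.52}$$

Using Eq. (8.48), we can write the kinetic energy as

$$\mathrm{K.E.} = \frac{1}{2} \sum_{j=1}^{n} \dot q_j^2 = \frac{1}{2} \sum_{j=1}^{n} \sum_{\alpha=1}^{n} \sum_{\beta=1}^{n} a_{j\alpha} a_{j\beta}\, \dot Q_\alpha \dot Q_\beta = \frac{1}{2} \sum_{\alpha=1}^{n} \dot Q_\alpha^2 \tag{8.53}$$

The potential energy is

$$V = \frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{n} v_{jk}\, q_j q_k = \frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{n} \sum_{\alpha=1}^{n} \sum_{\beta=1}^{n} v_{jk}\, a_{j\alpha} a_{k\beta}\, Q_\alpha Q_\beta \tag{8.54}$$

From Eq. (8.38) we have

$$\sum_{k=1}^{n} v_{jk}\, a_{k\beta} = \omega_\beta^2\, a_{j\beta} \tag{8.55}$$

which means that the potential energy can be written as (using Eq. 8.48)

$$V = \frac{1}{2} \sum_{\alpha=1}^{n} \sum_{\beta=1}^{n} \omega_\beta^2\, Q_\alpha Q_\beta \sum_{j=1}^{n} a_{j\alpha} a_{j\beta} = \frac{1}{2} \sum_{\alpha=1}^{n} \omega_\alpha^2 Q_\alpha^2 \tag{8.56}$$

Thus the lagrangian of Eq. (8.34) can be written in terms of the new
variables as

$$L = \frac{1}{2} \sum_{\alpha=1}^{n} \left( \dot Q_\alpha^2 - \omega_\alpha^2 Q_\alpha^2 \right) \tag{8.57}$$

The lagrangian in this form represents a set of harmonic oscillators
which no longer interact. That is, the variables are separated. Each
oscillator has unit mass and its own particular frequency $\omega_\alpha$. The
equation of motion for a particular oscillator is

$$\ddot Q_\alpha(t) = -\omega_\alpha^2\, Q_\alpha(t) \tag{8.58}$$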

This means that each mode oscillates freely at its own frequency
independent of any other mode. By comparing Eqs. (8.49) and (8.50) with
(8.51) we see that the real part of $c_\beta$ and the imaginary part of
$-c_\beta \omega_\beta$ are just the initial coordinate $Q_\beta(0)$ and the initial
velocity $\dot Q_\beta(0)$, respectively, of the $\beta$ mode. Thus the complicated
molecule is equivalent to a simple set of independent harmonic oscillators.
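
The separation expressed by Eqs. (8.53), (8.56), and (8.57) can be checked numerically. The sketch below again assumes the hypothetical matrix $v_{jk}$ = [[2, -1], [-1, 2]] and evaluates the lagrangian once in the old coordinates and once in the normal coordinates; the function names are illustrative only.

```python
import math

S = 1.0 / math.sqrt(2.0)
V = [[2.0, -1.0], [-1.0, 2.0]]   # hypothetical v_jk (illustrative assumption)
OMEGA_SQ = [1.0, 3.0]            # omega_alpha^2 for that matrix
A = [[S, S], [S, -S]]            # A[j][alpha] = a_{j alpha}


def to_normal(x):
    """Eq. (8.51): Q_alpha = sum_j a_{j alpha} x_j (used for q and for q-dot)."""
    return [sum(A[j][al] * x[j] for j in range(2)) for al in range(2)]


def lagrangian_old(q, qdot):
    """Kinetic minus potential energy written in the original coordinates."""
    kinetic = 0.5 * sum(v * v for v in qdot)
    potential = 0.5 * sum(V[j][k] * q[j] * q[k] for j in range(2) for k in range(2))
    return kinetic - potential


def lagrangian_normal(q, qdot):
    """Eq. (8.57): a sum of independent unit-mass oscillator lagrangians."""
    Q, Qdot = to_normal(q), to_normal(qdot)
    return 0.5 * sum(Qdot[al] ** 2 - OMEGA_SQ[al] * Q[al] ** 2 for al in range(2))


q, qdot = [0.3, -0.7], [1.1, 0.4]   # arbitrary test configuration
print(lagrangian_old(q, qdot), lagrangian_normal(q, qdot))
```

The two printed values agree to rounding error for any choice of `q` and `qdot`, since the transformation only re-expresses the same quadratic form.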

This new set of coordinates $Q_\alpha$, which permits us to describe the
system as a set of independent oscillators, is called a set of normal
coordinates. Using the lagrangian given by Eq. (8.57), we can write the
path integral describing the motion of the system in terms of normal
coordinates as

$$K = \int \exp\left\{ \frac{i}{\hbar} \int \frac{1}{2} \sum_{\alpha=1}^{n} \left( \dot Q_\alpha^2 - \omega_\alpha^2 Q_\alpha^2 \right) dt \right\} \mathcal{D}Q_1(t) \cdots \mathcal{D}Q_n(t) \tag{8.59}$$

This last result can be obtained directly from Eq. (8.35) by the
explicit substitution $q_j(t) = \sum_\alpha a_{j\alpha} Q_\alpha(t)$. The exponent simplifies
just as in the classical case, while $\mathcal{D}q_1 \cdots \mathcal{D}q_n = \mathcal{D}Q_1 \cdots \mathcal{D}Q_n$, at least
within a constant factor. (Since the transformation of coordinates is
linear, the jacobian is constant. Any such constant can be absorbed within
the definition of the normalizing factors for the path integral
$\mathcal{D}Q_1 \cdots \mathcal{D}Q_n$.)

This form of the path integral can be broken down into a product of
path integrals. Thus

$$K = \prod_{\alpha=1}^{n} \int \exp\left\{ \frac{i}{\hbar} \int \frac{1}{2} \left( \dot Q_\alpha^2 - \omega_\alpha^2 Q_\alpha^2 \right) dt \right\} \mathcal{D}Q_\alpha(t) \tag{8.60}$$

where each path integral now describes only one mode and each mode is
a simple one-dimensional oscillator, for which we have already obtained a
solution. In this manner any problem of interacting harmonic oscillators
can be analyzed.

Since the path integral for the kernel can be separated into a product 
of path integrals, it follows that the wave function for the system in a 
given energy state can be written as the product of wave functions of 
each mode as discussed in Sec. 3-8. 

As shown in Sec. 8-1, the energy wave functions for each separate
mode are proportional to $e^{-(i/\hbar) E_n t}$, where $E_n$ is the energy of
the mode. A product of such wave functions is then proportional to
$\exp\{-(i/\hbar)(\sum_n E_n)t\}$. From this it follows that the total energy of the
system of oscillators is equal to the sum of all the separate energies.
The energy in the $\alpha$ mode is $\hbar\omega_\alpha (m_\alpha + \tfrac{1}{2})$, where $m_\alpha$ is an integer. The
energy of the whole system is then

$$E = \hbar\omega_1 (m_1 + \tfrac{1}{2}) + \hbar\omega_2 (m_2 + \tfrac{1}{2}) + \cdots + \hbar\omega_n (m_n + \tfrac{1}{2}) \tag{8.61}$$

where $m_1, m_2, \ldots, m_n$ are all integers (including zero). All independent
choices are allowable because the excitation of oscillator 1 and oscillator
2 can be in different degrees.
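
Equation (8.61) is simple enough to state as a short function. This is a minimal sketch in units where $\hbar = 1$ by default; the function name `total_energy` and the sample frequencies are assumptions made for illustration.

```python
def total_energy(omegas, quantum_numbers, hbar=1.0):
    """Eq. (8.61): E = sum over modes of hbar * omega_alpha * (m_alpha + 1/2)."""
    if len(omegas) != len(quantum_numbers):
        raise ValueError("need exactly one quantum number per mode")
    if any(m < 0 or m != int(m) for m in quantum_numbers):
        raise ValueError("quantum numbers must be non-negative integers")
    return sum(hbar * w * (m + 0.5) for w, m in zip(omegas, quantum_numbers))


OMEGAS = [1.0, 2.0, 3.0]                      # illustrative mode frequencies
ground = total_energy(OMEGAS, [0, 0, 0])      # every mode in its lowest state
excited = total_energy(OMEGAS, [1, 0, 2])     # modes 1 and 3 excited independently
print(ground, excited, excited - ground)
```

The difference `excited - ground` is the sum of $m_\alpha \hbar\omega_\alpha$ over the modes, which shows that each mode contributes its excitation independently of the others.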

If $\phi_m(Q)$ is the harmonic oscillator wave function for the mth energy
eigenstate, then the wave function for the complete system is

$$\phi_{m_1}(Q_1)\, \phi_{m_2}(Q_2) \cdots \phi_{m_n}(Q_n) = \prod_{\alpha=1}^{n} \phi_{m_\alpha}(Q_\alpha) \tag{8.62}$$

Each $\phi_{m_\alpha}(Q_\alpha)$ is as given in Eq. (8.7) with $\omega$ replaced by $\omega_\alpha$. In this
way classical physics, whereby we determine the normal modes, and
quantum mechanics, whereby we determine the energy levels and wave
functions for a single mode, are combined to give a complete solution
for the energy levels and eigenfunctions of a polyatomic molecule.

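We can express the wave functions in terms of the original
coordinates $q_j(t)$ by using the transformation equations (8.51). For example,
the lowest energy state of a system, which has the energy
$\frac{\hbar}{2} \sum_{\alpha=1}^{n} \omega_\alpha$, has the (unnormalized) wave function

$$\Phi_0 = \prod_{\alpha=1}^{n} \exp\left[ -\frac{\omega_\alpha}{2\hbar} Q_\alpha^2 \right] = \exp\left[ -\frac{1}{2\hbar} \sum_{\alpha=1}^{n} \omega_\alpha Q_\alpha^2 \right] = \exp\left[ -\frac{1}{2\hbar} \sum_{\alpha=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} \omega_\alpha\, a_{j\alpha} a_{k\alpha}\, q_j q_k \right] \tag{8.63}$$

That is, the wave function is an exponential function of the quadratic
form $\frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{n} M_{jk}\, q_j q_k$, where the matrix element $M_{jk}$ is

$$M_{jk} = \frac{1}{\hbar} \sum_{\alpha=1}^{n} \omega_\alpha\, a_{j\alpha} a_{k\alpha} \tag{8.64}$$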

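Problem 8-2 Show that the matrix

$$T_{jk} = \sum_{\alpha} \frac{a_{j\alpha}\, a_{k\alpha}}{\omega_\alpha}$$

is the reciprocal square root of the $v_{jk}$ matrix. That is, show

$$\sum_{l} \sum_{m} T_{jl}\, T_{lm}\, v_{mk} = \delta_{jk} \tag{8.65}$$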


It may happen that some of the frequencies $\omega_\alpha$ are zero. For example,
for the molecule CO2 the modes 5 to 9, as pictured in Fig. 8-1, all have
frequency zero. They correspond to a translation or a rotation of the
whole molecule, motions for which there is no restoring force. Since
there is no restoring force, the assumption that the coordinates $Q_\alpha$ are
small is not generally true. A more exact analysis of the translation or
rotation kinetic energy must be undertaken. Since such motions are not
of interest in the present discussion, we shall assume that these modes,
and their coordinates, either do not exist or are never excited, so that we
have been dealing with modes for which $\omega_\alpha \neq 0$. If for particular values
of $\alpha$ the solutions $\omega_\alpha^2$ come out negative (so that $\omega_\alpha$ is imaginary), the
system is in unstable equilibrium for motions in this mode, like a pencil
balanced on its point. Instead of being simple harmonic, the motion
is exponentially divergent, and again the coordinates $Q_\alpha$ do not stay
small. This case again is of no interest in the present discussion, and we
shall assume that there are no such modes.


8-4 THE ONE-DIMENSIONAL CRYSTAL


A Simple Model. We can think of a crystal as a large polyatomic
molecule spread out in a three-dimensional array. We can begin learning
about it by first studying a simpler one-dimensional line of equal atoms
equally spaced, as in Fig. 8-2. Let the mass of each atom be m and
let the displacement of the jth atom from its equilibrium position be
$q_j/\sqrt{m}$. We suppose that the motions are restricted to lie along the line
of the array, i.e., longitudinal motions only. Next, suppose each atom
interacts only with its two neighbors, and that the potential energy of
interaction between a pair of adjacent atoms separated by a distance
R is V(R). That is, we suppose the atoms are connected together by
a set of springs. The equilibrium separation gives a minimum value
to the potential. We shall assign this minimum the value 0. Suppose
$\Delta R$ is the difference between the equilibrium displacement and some
particular displacement. We can expand the potential in a power series
in terms of $\Delta R$, in a manner analogous to that of Eq. (8.32). We shall
restrict our attention to those displacements which are so small that all
terms higher than the second order in this expansion can be neglected.
Between the jth and (j+1)st atoms the change in separation away from
the equilibrium separation is $(q_{j+1} - q_j)/\sqrt{m} = \Delta R_{j,j+1}$. We shall call
the second derivative of the potential with respect to the displacement
$m\nu^2$ (the same for all atoms in the string). Then the potential energy
associated with this pair is

$$V_{j,j+1} = \frac{\nu^2}{2} (q_{j+1} - q_j)^2 \tag{8.66}$$

and the lagrangian can be written as

$$L = \frac{1}{2} \sum_{j=1}^{N} \dot q_j^2 - \frac{\nu^2}{2} \sum_{j=1}^{N} (q_{j+1} - q_j)^2 \tag{8.67}$$

If the first and last atoms are unattached, then the term for j = N in
the expression for potential energy must be omitted.

Fig. 8-2 A model of a one-dimensional “crystal,” with mass particles evenly
spaced along a line and springs connecting neighboring particles.

Based on this lagrangian the classical equations of motion for the
atoms along the line are

$$\ddot q_j(t) = \nu^2 \{ [q_{j+1}(t) - q_j(t)] - [q_j(t) - q_{j-1}(t)] \} = \nu^2 [q_{j+1}(t) - 2q_j(t) + q_{j-1}(t)] \tag{8.68}$$


for all j except the end points j = 1 and j = N. Now this fact that 
the end particles have to be given separate consideration is just a mi- 
nor annoyance for most problems. Usually we are interested in the gross 
properties for a large solid and are not concerned with surface or bound- 
ary effects. In such cases the main results desired are really independent 
of the actual boundary conditions (e.g., whether or not the end atoms 
are left free or are tied down, etc.). To avoid this problem, theoretical 
physicists use a trick of assuming a special set of simple boundary con- 
ditions, called periodic boundary conditions, so that these end points do 
not require special consideration in the analysis. Unfortunately, these 
special boundary conditions occur in actuality only rarely, if at all, but 
for phenomena which are independent of boundary effects the trick is 
useful. 

The idea is to imagine that the string of atoms goes on beyond N,
but a displacement of the (N + j)th atom is always exactly equal to that
of the jth atom. Thus the boundary condition is

$$q_{N+j}(t) = q_j(t) \tag{8.69}$$


This boundary condition would be right if our string were tied in a circle 
like a pearl necklace. However, in three dimensions there is no such 
picture to represent the boundary condition, and it must be considered 
completely artificial. 

The value of this particular boundary condition is this. Most gen- 
eral ways of terminating the string (e.g., tying the last atom to a rigid 
wall, leaving the last atom free, etc.) result in a reflection of any wave 
traveling down the string. Only if the last atom is tied to another string 
of atoms of identical characteristics will no such reflection occur. Thus 
the boundary condition is analogous to tying a transmission line to a 
characteristic impedance in order to avoid reflections. The characteristic 
impedance is equivalent to an infinity of more line. In the present case 
we accomplish this end by tying the string to itself. We call these
boundary conditions periodic, because anything that happens at the point k in
the string is repeated again at the point N + k and again at 2N + k, etc.
With this boundary condition, Eq. (8.68) for the motion of the atoms is
valid for all of the atoms.

Solving the Classical Equations of Motion. We assume that
the displacements $q_j(t)$ are periodic with frequency $\omega$. Then we must
solve

$$-\omega^2 q_j(t) = \nu^2 [q_{j+1}(t) - 2q_j(t) + q_{j-1}(t)] \tag{8.70}$$

We could write down this set of equations in a determinant, and it turns
out that the determinantal equation so obtained can be evaluated by
theorems in mathematics. But this just means that the equations can
be solved directly, and it is easier to solve them that way.

We shall restrict the symbol i to mean $\sqrt{-1}$, and not use it for an
index. Normal mode solutions are of the form

$$q_j(t) = \mathrm{Re}\{ A\, e^{i(Kj - \omega t)} \} = \mathrm{Re}\{ a_j\, e^{-i\omega t} \} \tag{8.71}$$

where K is a constant taking on a discrete set of values. This solution
can be verified by substitution into Eq. (8.70). The frequency is given
by

$$\omega^2 = 2\nu^2 (1 - \cos K) = 4\nu^2 \sin^2 \frac{K}{2} \tag{8.72}$$


This gives the values of $\omega$ in terms of K, but not all values of K are
allowed. The periodic boundary condition implies that $K = 2\pi\alpha/N$,
where $\alpha = 0, 1, 2, \ldots, N-1$. (The case $\alpha = 0$ is just a translation, and
we can omit it if desired. Furthermore, the case given by $\alpha' = N + \alpha$
is the same as the case given by $\alpha$.) Thus for any particular choice of $\alpha$
we have the frequency

$$\omega_\alpha = 2\nu \left| \sin \frac{\pi\alpha}{N} \right| \tag{8.73}$$

and the amplitude for the jth coordinate at that frequency is

$$a_{j\alpha} = A\, e^{i 2\pi\alpha j/N} \tag{8.74}$$

The constants $a_{j\alpha}$ determined in the last equation are complex. They
could be made real by combining solutions for $\alpha$ and $-\alpha$ (or $\alpha$ and
$N - \alpha$). However, it is more convenient to leave them in the complex
form. It is also sometimes convenient to consider both positive and
negative values of $\alpha$, and so if N is odd, for example, to consider the
range of $\alpha$ to be $-\tfrac{1}{2}(N - 1)$ to $+\tfrac{1}{2}(N - 1)$ rather than 0 to $N - 1$.
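
The statement that Eq. (8.74) solves Eq. (8.70) with the frequency of Eq. (8.73) can be verified by direct substitution. The sketch below does this for a small ring with periodic boundary conditions; the function names and the values of N and $\nu$ are assumptions chosen for illustration, and the amplitude A is set to 1 since the overall magnitude is arbitrary at this stage.

```python
import cmath
import math


def ring_mode(N, alpha):
    """Amplitudes a_j = exp(i 2 pi alpha j / N) for j = 1..N (Eq. 8.74 with A = 1)."""
    return [cmath.exp(2j * math.pi * alpha * j / N) for j in range(1, N + 1)]


def omega(N, alpha, nu):
    """Eq. (8.73): omega_alpha = 2 nu |sin(pi alpha / N)|."""
    return 2.0 * nu * abs(math.sin(math.pi * alpha / N))


def max_residual(N, alpha, nu):
    """Largest mismatch between the two sides of Eq. (8.70) over all atoms."""
    a = ring_mode(N, alpha)
    w2 = omega(N, alpha, nu) ** 2
    worst = 0.0
    for j in range(N):
        # Indices wrap around the ring: this is the periodic condition, Eq. (8.69).
        right = nu ** 2 * (a[(j + 1) % N] - 2.0 * a[j] + a[(j - 1) % N])
        worst = max(worst, abs(-w2 * a[j] - right))
    return worst


N, NU = 8, 1.5
for alpha in range(N):
    print(alpha, omega(N, alpha, NU), max_residual(N, alpha, NU))
```

The residual is at the level of rounding error for every $\alpha$, and `omega(N, alpha + N, nu)` agrees with `omega(N, alpha, nu)`, in line with the remark that $\alpha' = N + \alpha$ describes the same mode.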

The relative displacements of the atoms in the string depend on the
size of $\alpha$. The situation for two values of $\alpha$, one with $\alpha$ small and the
other with $\alpha = N/2$, is shown in Fig. 8-3.

Fig. 8-3 The displacement of atoms along a string is plotted as the ordinate
against the equilibrium positions j equally spaced along the abscissa. In the
upper case the wavelength is long compared to the spacing between atoms
($\alpha$ small). In the lower case $\alpha = N/2$, and the displacements no longer give
the appearance of a smooth sine wave.


Although the relative magnitudes of the various constants $a_{j\alpha}$ are
determined by Eq. (8.74), the overall magnitude, determined by the
constant A, is still arbitrary. We establish this with a normalizing
equation analogous to Eq. (8.48). Thus choose A so that

$$\sum_{j=1}^{N} a^*_{j\alpha}\, a_{j\beta} = \delta_{\alpha\beta} \tag{8.75}$$

which implies that

$$A = \frac{1}{\sqrt{N}} \tag{8.76}$$

We are now in a position to represent the various modes with their
normal coordinates as

$$Q_\alpha(t) = \sum_{j=1}^{N} a^*_{j\alpha}\, q_j(t) = \frac{1}{\sqrt{N}} \sum_{j=1}^{N} q_j(t)\, e^{-i 2\pi\alpha j/N} \tag{8.77}$$

where an arbitrary motion is $q_j(t) = \mathrm{Re}\{ \sum_\alpha c_\alpha a_{j\alpha} e^{i\omega_\alpha t} \}$ in analogy
with Eq. (8.42). These normal coordinates are complex, but we can
ensure that the lagrangian derived from them is real by writing it as

$$L = \frac{1}{2} \sum_{\alpha=0}^{N-1} \left( \dot Q^*_\alpha \dot Q_\alpha - \omega_\alpha^2\, Q^*_\alpha Q_\alpha \right) \tag{8.78}$$

Perhaps this use of complex coordinates $Q_\alpha$ needs a word of
explanation. Since $q_j$, the physical coordinates, are real, Eq. (8.77) implies
$Q^*_\alpha = Q_{-\alpha}$, so that, although two real numbers are required to
specify each complex coordinate $Q_\alpha$, only N independent real numbers are
needed for all of them. If one prefers real coordinates, one can define
instead two real quantities as coordinates by writing

$$Q_\alpha = \frac{1}{\sqrt{2}} \left( Q^c_\alpha - i Q^s_\alpha \right) \tag{8.79}$$

$$Q^c_\alpha = \frac{1}{\sqrt{2}} \left( Q_\alpha + Q_{-\alpha} \right) \qquad Q^s_\alpha = \frac{i}{\sqrt{2}} \left( Q_\alpha - Q_{-\alpha} \right) \tag{8.80}$$

A term such as the kinetic energy is expressed in real variables as

$$\frac{1}{2} \left[ (\dot Q^c_\alpha)^2 + (\dot Q^s_\alpha)^2 \right] = \dot Q^*_\alpha \dot Q_\alpha = \dot Q_{-\alpha} \dot Q_\alpha \tag{8.81}$$

(The factor $\tfrac{1}{2}$ reappears in Eq. (8.78) because we sum there over all $\alpha$,
plus and minus, counting thereby each term twice, $Q^*_{-\alpha} Q_{-\alpha} = Q_\alpha Q^*_\alpha$.)
Thus quadratic expressions derived previously for real quantities appear
now as products of one complex number by its conjugate (for example,
Eq. 8.75).
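
The properties used in this passage, namely the orthonormality of Eq. (8.75), the relation $Q^*_\alpha = Q_{-\alpha}$, and the fact that a sum of squares of real coordinates becomes a sum of $Q^*_\alpha Q_\alpha$, can be checked numerically. The following is a sketch under the conventions of Eqs. (8.74), (8.76), and (8.77); the function names and the sample displacements are assumptions made for illustration.

```python
import cmath
import math


def mode_amplitude(N, j, alpha):
    """a_{j alpha} of Eq. (8.74), with A = 1/sqrt(N) from Eq. (8.76)."""
    return cmath.exp(2j * math.pi * alpha * j / N) / math.sqrt(N)


def overlap(N, alpha, beta):
    """Left-hand side of Eq. (8.75): sum over j of conj(a_{j alpha}) * a_{j beta}."""
    return sum(
        mode_amplitude(N, j, alpha).conjugate() * mode_amplitude(N, j, beta)
        for j in range(1, N + 1)
    )


def normal_coordinates(q):
    """Eq. (8.77): complex Q_alpha for alpha = 0..N-1 from real q_1..q_N."""
    N = len(q)
    return [
        sum(mode_amplitude(N, j, alpha).conjugate() * q[j - 1] for j in range(1, N + 1))
        for alpha in range(N)
    ]


q = [0.2, -1.0, 0.5, 0.7, -0.3]   # arbitrary real displacements, N = 5
Q = normal_coordinates(q)
N = len(q)
print("overlap(1,1) =", overlap(N, 1, 1), " overlap(1,3) =", overlap(N, 1, 3))
print("sum q^2 =", sum(x * x for x in q), " sum |Q|^2 =", sum(abs(z) ** 2 for z in Q))
```

Because the mode index is periodic, $Q_{-\alpha}$ is stored here as `Q[(N - alpha) % N]`; comparing it with `Q[alpha].conjugate()` confirms that only N real numbers are independent.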


Problem 8-3 Show that $Q^c_\alpha$, $Q^s_\alpha$ are normal coordinates
corresponding to standing wave normal modes $\cos(2\pi\alpha j/N)$ and $\sin(2\pi\alpha j/N)$,
in the sense that (for N odd)

$$q_j(t) = \sqrt{\frac{2}{N}} \sum_{\alpha=1}^{(N-1)/2} \left[ Q^c_\alpha(t) \cos \frac{2\pi\alpha j}{N} + Q^s_\alpha(t) \sin \frac{2\pi\alpha j}{N} \right] \tag{8.82}$$

Problem 8-4 Show that the ground-state wave function for the
lagrangian of Eq. (8.78) can be written

$$\Phi_0 = A \exp\left\{ -\frac{1}{2\hbar} \sum_{\alpha=1}^{N-1} \omega_\alpha\, Q^*_\alpha Q_\alpha \right\} \tag{8.83}$$

(where A is a constant) by starting with the wave function in terms of
the real variables $Q^c_\alpha$ and $Q^s_\alpha$.


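Problem 8-5 A transition element which employs the same wave
function as both the initial and final states is called an expectation value.¹
Thus the expectation value of F for the ground state $\Phi_0$ of Eq. (8.83) is

$$\langle \Phi_0 | F | \Phi_0 \rangle = \int\!\!\int \cdots \int \Phi_0^*\, F\, \Phi_0 \; dQ_1\, dQ_2 \cdots dQ_{N-1} \tag{8.84}$$

(The integral over complex variables is defined as equal to the
corresponding integral over real normal coordinates $Q^c_\alpha$ and $Q^s_\alpha$.) Show that
the following expectation values are correct (for $\alpha \neq 0$):

$$\begin{aligned}
\langle \Phi_0 | Q_\alpha | \Phi_0 \rangle &= \langle \Phi_0 | Q^*_\alpha | \Phi_0 \rangle = 0 \\
\langle \Phi_0 | Q_\alpha^2 | \Phi_0 \rangle &= \langle \Phi_0 | Q^{*2}_\alpha | \Phi_0 \rangle = 0 \\
\langle \Phi_0 | Q^*_\alpha Q_\alpha | \Phi_0 \rangle &= \frac{\hbar}{2\omega_\alpha} \langle \Phi_0 | 1 | \Phi_0 \rangle \\
\langle \Phi_0 | Q^*_\alpha Q_\beta | \Phi_0 \rangle &= 0 \quad \text{if } \alpha \neq \beta
\end{aligned} \tag{8.85}$$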


Thus with the lagrangian written in terms of normal coordinates 
we have reduced the system to a set of independent simple harmonic 
oscillators. The quantum-mechanical part of the solution follows in a 
straightforward manner just as it did for the case of the polyatomic 
molecule. All that we need to know is the quantum-mechanical solution 
for an independent simple harmonic oscillator. 


Problem 8-6 Show that the constants $a_{j\alpha}$ are the same even if the
coupling is not just to the nearest neighbors but extends with strengths
$\lambda_k$ to atoms k spaces away. Assuming $\lambda_k$ falls rapidly enough for large k,
find the values of the frequency $\omega_\alpha$ when such a coupling is present, i.e.,
when the potential energy, instead of being given by Eq. (8.66), is given
by a similar equation, but one which contains the relative displacements
of all pairs of atoms, each one multiplied by the appropriate $\lambda_k$, that is,
$V = (\nu^2/2) \sum_j \sum_k \lambda_k (q_{j+k} - q_j)^2$.


¹Compare this definition of expectation value with the definition of the expected
value of an operator given in Sec. 5-3, particularly in Eq. (5.46).


THE APPROXIMATION OF CONTINUITY


The particular modes which we have determined here are those in which 
each atom oscillates with a phase difference behind the one next in line. 
There is a wave of oscillation passing down the line of atoms. If the 
phase difference between adjacent atoms is small, then the wavelength 
is long. 

Of special interest is the behavior of the atoms in the long-wavelength 
modes. If the wavelength greatly exceeds the spacing between atoms, 
this spacing is unimportant. In this case the motion can be very well de- 
scribed by the fictitious “continuous medium” concept. A line of atoms 
can be replaced by a continuous rod with certain average properties, 
such as the mass per unit length $\rho = m/d$. More physically, a real rod
is actually a discrete set of atoms. In this section we shall develop the 
approximation of continuity, wherein a line of atoms is replaced by a 
continuous string. 

For a particular mode of motion the phase difference between
adjacent atoms is $2\pi\alpha/N$, so that a wavelength contains $N/\alpha$ atoms, or if d
is the equilibrium separation distance between neighboring atoms, the
wavelength is $\lambda = Nd/\alpha$. The wave number is

$$k = \frac{2\pi}{\lambda} = \frac{2\pi\alpha}{Nd} \tag{8.86}$$

The wave aspect is made more clear in the mathematical
representation of the motion by a slight change of notation. We shall refer to
each mode by its k value instead of by its $\alpha$ value. Then a summation
$\sum_\alpha$ over the modes means a sum over discrete values of k. These values
are the integers multiplied by $2\pi/L$, where L = Nd is the length of the
string. Suppose $x_j = jd$ is the equilibrium position of the jth atom.
Then the equations describing the motion of the atom become

$$a_{jk} = \frac{1}{\sqrt{N}}\, e^{ikx_j} \tag{8.87}$$

$$Q_k = \frac{1}{\sqrt{N}} \sum_{j=1}^{N} q_j\, e^{-ikx_j} \tag{8.88}$$

$$q_j = \frac{1}{\sqrt{N}} \sum_{k} Q_k\, e^{ikx_j} \tag{8.89}$$

and

$$\omega_k = 2\nu \left| \sin \frac{kd}{2} \right| \tag{8.90}$$








We now assume that the separation between atoms is very small
compared to the length over which disturbances change. Using the symbols
we have already defined, such situations as this are described by $kd \ll 1$.
If we call the product $\nu d = c$, then for kd small we have $\omega \approx kc$. In this
situation we can think of the coordinates $q_j$ as being functions of
position along the line of atoms. That is, we can specify the displacement
of the jth atom, as shown in Fig. 8-3. For long waves the displacements
$q(x_j)$ and $q(x_{j+1})$ are nearly equal, and we can consider the function
q(x) as a smooth continuous function defining displacement as a
function of equilibrium position along the line. The normal coordinate Q(k)
is a Fourier transform of q(x). That is, Eq. (8.88) can be replaced by

$$Q(k) = \frac{1}{d\sqrt{N}} \int q(x)\, e^{-ikx}\, dx \tag{8.91}$$


This replacement is based on the approximate relation

$$\sum_{j=1}^{N} (\;) \approx \frac{1}{d} \int_0^L (\;)\, dx \tag{8.92}$$

which becomes more valid as the spacing between discrete points
becomes very small.

A similar relation, namely,

$$\sum_{k} (\;) \approx \frac{L}{2\pi} \int_0^{2\pi/d} (\;)\, dk \tag{8.93}$$

leads to the inverse transform

$$q(x) = \frac{L}{2\pi\sqrt{N}} \int_0^{2\pi/d} Q(k)\, e^{ikx}\, dk \tag{8.94}$$

To make these quantities of more direct physical significance, let the
actual displacement of the jth atom be $u_j$. That is, $q_j = \sqrt{m}\, u_j$, where m
is the mass of one atom and is equal to $\rho d$. Let the Fourier transform
of u be U. Thus

$$U(k) = \int u(x)\, e^{-ikx}\, dx \tag{8.95}$$

while the inverse transform is

$$u(x) = \int U(k)\, e^{ikx}\, \frac{dk}{2\pi} \tag{8.96}$$

The new normal coordinate is then U(k), and it is related to the previous
normal coordinate Q(k) by

$$U(k) = \sqrt{\frac{L}{\rho}}\; Q(k) \tag{8.97}$$


The expression for the kinetic energy in terms of u(x,t) can be
worked out with the help of Eq. (8.92) to be

$$\mathrm{K.E.} = \frac{\rho}{2} \int_0^L \left( \frac{\partial u}{\partial t} \right)^2 dx \tag{8.98}$$


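To determine the potential energy in terms of all the new variables,
we need to express the difference in the displacements of two adjacent
atoms as a continuous function of position. Using our approximation of
continuity, we can write

$$q_{j+1} - q_j = \sqrt{m}\, [u(x_{j+1}, t) - u(x_j, t)] \approx \sqrt{m}\; d\, \frac{\partial u}{\partial x} \tag{8.99}$$

That means that the potential energy is

$$V = \frac{\nu^2}{2d} \int_0^L m d^2 \left( \frac{\partial u}{\partial x} \right)^2 dx = \frac{\rho c^2}{2} \int_0^L \left( \frac{\partial u}{\partial x} \right)^2 dx \tag{8.100}$$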


In the last equation we have used the constant $c = \nu d$. This constant
is actually a measure of the elasticity. We can define it physically in
the following manner: Suppose we stretch the line of atoms, which has
length L, by a fractional increase of amount $\epsilon$, that is, to the new length
$L(1 + \epsilon)$. (We are considering a static stretch, not a vibration.) This
means that we make the separation between each pair of atoms equal to
$d(1 + \epsilon)$ instead of d. Thus the difference in displacements of adjacent
atoms becomes

$$q_{j+1} - q_j = \epsilon d \sqrt{m} \tag{8.101}$$


Using Eq. (8.66), this means the potential energy put into the string by
the stretching is

$$V = \frac{\nu^2}{2}\, \epsilon^2 d^2 m N = \frac{\rho c^2}{2}\, \epsilon^2 L \tag{8.102}$$


Thus the force that is needed to stretch the string is, in the limit of
small $\epsilon$,

$$F = \frac{\partial V}{\partial (\epsilon L)} = \rho c^2 \epsilon \tag{8.103}$$

This last equation gives the stress in the string, while the strain (stretch
per unit length) is of course $\epsilon$. Thus we have

$$\frac{\text{stress}}{\text{strain}} = \rho c^2 = \text{elastic constant} \tag{8.104}$$

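Combining Eqs. (8.98) and (8.100), we can construct the lagrangian
as

$$L = \frac{\rho}{2} \int_0^L \left( \frac{\partial u}{\partial t} \right)^2 dx - \frac{\rho c^2}{2} \int_0^L \left( \frac{\partial u}{\partial x} \right)^2 dx \tag{8.105}$$

The fundamental modes which we are considering have the form
$e^{ikx}$ and the normal coordinates are U(k,t). The reader can show
that, for a long string, the lagrangian can be expressed in terms of these
normal coordinates as

$$L = \frac{\rho}{2} \int \left[ \dot U^*(k,t)\, \dot U(k,t) - k^2 c^2\, U^*(k,t)\, U(k,t) \right] \frac{dk}{2\pi} \tag{8.106}$$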

We can consider the system described by this lagrangian as a set of 
harmonic oscillators, one oscillator for each value of k. In our present 
approximation of continuity, k is a continuous variable with an infinite 
number of values. We can reintroduce the picture of discrete atoms by 
remembering that the integral over k is really a sum over discrete values 
of k, where the various discrete values of k are spaced a distance $2\pi/L$
apart, with L the length of the string, and the number of such values is 
equal to the number of atoms in the string. 

We can get equations of motion in terms of the continuous variables
by finding the extremum of the action integral $\int L\, dt$. Using the form
of L given by Eq. (8.105), the resulting equation of motion is

$$\rho\, \frac{\partial^2 u}{\partial t^2} = \rho c^2\, \frac{\partial^2 u}{\partial x^2} \tag{8.107}$$

Following a line of argument demonstrated by Eq. (8.99), we can see
that this equation of motion is analogous to the previous equation of
motion which we derived, namely, Eq. (8.68). Equation (8.107) has the
solution

$$u(x,t) = a(x)\, e^{-i\omega t} \tag{8.108}$$

in analogy with Eq. (8.71), where

$$-\omega^2 a(x) = c^2\, \frac{d^2 a}{dx^2} \tag{8.109}$$

in analogy with Eq. (8.70), and

$$a(x) = e^{ikx} \tag{8.110}$$

in analogy with Eq. (8.74).

Combining Eqs. (8.109) and (8.110), we see that $\omega = kc$. This is the
analogue of Eq. (8.90), and, as a matter of fact, in the limit of small k,
Eq. (8.90) reduces to this relation.

The motion described by Eq. (8.108), with the value of a(x) given
by Eq. (8.110), is that of a traveling wave moving with velocity c. That
is to say, c is the speed of sound along the line of atoms. Actually, a
real system shows dispersion; that is, $\omega$ is not exactly proportional to k.
For wavelengths which are of the same order as the atomic spacing this
lack of proportionality becomes important, as shown by Eq. (8.90).
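
The size of the dispersion can be made concrete by comparing Eq. (8.90) with the continuum relation $\omega = kc$, where $c = \nu d$. The sketch below tabulates the fractional amount by which the lattice frequency falls short of kc; the function names and the unit values of $\nu$ and d are assumptions chosen for illustration.

```python
import math


def omega_lattice(k, nu, d):
    """Eq. (8.90): omega_k = 2 nu |sin(k d / 2)| for the discrete line of atoms."""
    return 2.0 * nu * abs(math.sin(0.5 * k * d))


def omega_continuum(k, nu, d):
    """Continuum limit omega = k c, with the speed of sound c = nu * d."""
    return nu * d * abs(k)


def relative_shortfall(k, nu, d):
    """Fraction by which the lattice frequency lies below k c (requires k != 0)."""
    return 1.0 - omega_lattice(k, nu, d) / omega_continuum(k, nu, d)


NU, D = 1.0, 1.0   # illustrative units
for kd in (0.01, 0.1, 1.0, math.pi):
    print("kd =", kd, " shortfall =", relative_shortfall(kd / D, NU, D))
```

For $kd \ll 1$ the shortfall is approximately $(kd)^2/24$, which is negligible, while at the shortest wavelength ($kd = \pi$) the lattice frequency is only $2/\pi$ of the continuum value.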


QUANTUM MECHANICS OF A LINE OF ATOMS 


The behavior of the atoms in a string can be described in terms of modes
of motion. Each mode is a harmonic oscillator. The energy state of any
particular mode is determined by the quantum number for that mode.
Each mode is identified by its wave number k or its frequency $\omega$. A
mode of frequency $\omega$ can have the energy values $\tfrac{1}{2}\hbar\omega$, $\tfrac{3}{2}\hbar\omega$, $\tfrac{5}{2}\hbar\omega$, ...,
or in other words 0, $\hbar\omega$, $2\hbar\omega$, ... above the ground state energy $\tfrac{1}{2}\hbar\omega$.
For these cases we would say that there are 0, 1, 2, ... phonons of wave
number k (or frequency $\omega$) present.

It is possible to have several different modes excited simultaneously.
For example, we could have (1) the mode of wave number $k_1$ excited
to its first level above the ground state, (2) the mode of wave number
$k_2$ excited to its first level also, and (3) the mode of wave number $k_3$
excited to its second level above its ground state. The state of the
complete system would then have the total energy $\hbar(\omega_1 + \omega_2 + 2\omega_3)$
above the ground energy. We would say that there are four phonons
present: one phonon of wave number $k_1$, one of wave number $k_2$, and
two of wave number $k_3$.

The ground state of the entire system has the energy

$$E_{\mathrm{gnd}} = \sum_{k} \frac{\hbar\omega_k}{2} \tag{8.111}$$

Using the approximation of continuity (see Eq. 8.93) and letting
$\omega = kc$, this becomes

$$E_{\mathrm{gnd}} = \frac{L}{2\pi} \int_0^{k_{\max}} \frac{\hbar k c}{2}\, dk \tag{8.112}$$

If the upper limit $k_{\max}$ on the integral over k goes to infinity, then the
integral diverges. However, the form $\omega = kc$ used in this expression is
valid only for long waves (i.e., small values of k).

We can make a better determination of the ground-state energy by
using the correct expression for $\omega$ and establishing a reasonable upper
limit for the integral over k. Thus, using Eq. (8.90) for $\omega_k$, we can write
the ground-state energy as

$$E_{\mathrm{gnd}} = \sum_{k=-k_{\max}}^{k_{\max}} \hbar\nu \left| \sin \frac{kd}{2} \right| \tag{8.113}$$

where

$$k_{\max} = \frac{\pi}{d} \tag{8.114}$$

This can be rewritten as

$$E_{\mathrm{gnd}} = \sum_{\alpha=-N/2}^{N/2} \hbar\nu \left| \sin \frac{\pi\alpha}{N} \right| = 2\hbar\nu \sum_{\alpha=0}^{N/2} \sin \frac{\pi\alpha}{N} \tag{8.115}$$

For a very large N this sum can be approximated by an integral to give

$$E_{\mathrm{gnd}} = 2\hbar\nu\, \frac{N}{\pi} = \frac{2\hbar c L}{\pi d^2} \tag{8.116}$$

This result shows that the energy is proportional to the length of the
string, but apparently it has no limit as the spacing d approaches zero.
That is, the ground-state energy is infinite for a continuous medium. Of
course, for real matter the energy is finite.
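
Both statements can be illustrated with a short calculation: the discrete sum of Eq. (8.115) approaches the estimate $2\hbar\nu N/\pi$ as N grows, and for fixed L and c the estimate grows as $1/d^2$. This is a minimal sketch in units where $\hbar = 1$; the function names and sample values are assumptions, and the sum is written for even N only.

```python
import math


def ground_energy_sum(N, nu, hbar=1.0):
    """Eq. (8.115): sum of hbar*nu*|sin(pi*alpha/N)| for alpha = -N/2 .. N/2."""
    if N % 2:
        raise ValueError("this sketch assumes an even number of atoms")
    half = N // 2
    return sum(
        hbar * nu * abs(math.sin(math.pi * alpha / N))
        for alpha in range(-half, half + 1)
    )


def ground_energy_estimate(N, nu, hbar=1.0):
    """Eq. (8.116), first form: the large-N value 2*hbar*nu*N/pi."""
    return 2.0 * hbar * nu * N / math.pi


def ground_energy_continuum(length, c, d, hbar=1.0):
    """Eq. (8.116), second form: 2*hbar*c*L/(pi*d^2), using nu = c/d and N = L/d."""
    return 2.0 * hbar * c * length / (math.pi * d * d)


for n_atoms in (10, 100, 1000):
    ratio = ground_energy_sum(n_atoms, 1.0) / ground_energy_estimate(n_atoms, 1.0)
    print(n_atoms, ratio)

# Halving the spacing at fixed length and sound speed quadruples the estimate.
print(ground_energy_continuum(1.0, 1.0, 0.5) / ground_energy_continuum(1.0, 1.0, 1.0))
```

The ratio of the sum to the estimate tends to 1 as N increases, while the last line shows the unbounded growth of the ground-state energy as the spacing shrinks.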

It is very convenient to measure, instead of the total energy, the
excess energy above the ground state. There are two reasons for this:
(1) Really, the ground-state energy is not known, nor is it usually
interesting to the physical problem in question. For example, the true
ground-state energy includes all of the energy of the electrons attached
to the atoms. (2) When dealing with the excitation of only long waves,
the approximation of continuity is very useful, and it gives a good
approximation to the excitation energies. However, this approximation
gives an invalid result for the ground-state energy, since it neglects the
separation d (i.e., treats d as 0). Thus we must avoid the necessity of
evaluating the ground-state energy if we are to use the approximation
of continuity.


THE THREE-DIMENSIONAL CRYSTAL


There is no difference in principle between a realistic three-dimensional 
crystal and the one-dimensional example which we have been consider- 
ing. However, the detailed evaluation of the various modal frequencies 
is much harder. Results can be obtained in terms of the wave number k, 
which is now a vector with components $k_x$, $k_y$, and $k_z$. The frequency,
written in terms of these components, is generally very complicated. 
There is more than one solution for each value of k because of the pos- 
sibility of various polarizations (directions of vibration). Furthermore, 
a real crystal often consists not of an array of atoms equally spaced, 
but rather of an array of unit cells, each unit cell consisting of a group 
of atoms in some characteristic geometrical arrangement. If there are 
several atoms (say p) in such a unit cell (and this example can be illus- 
trated with a one-dimensional model), then there are 3p frequencies for 
each value of k. 
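
The one-dimensional illustration mentioned above can be sketched numerically. The following is a minimal sketch, assuming a chain of alternating masses joined by identical nearest-neighbour springs; the function name, the parameters `a` (length of the unit cell), `spring`, `m1`, `m2`, and the spring model itself are illustrative assumptions, not taken from the text. With p = 2 atoms per cell the chain has two frequencies for each k (in three dimensions each would carry three polarizations, giving 3p).

```python
import math


def diatomic_branches(k, a=1.0, spring=1.0, m1=1.0, m2=2.0):
    """Return (acoustic, optical) angular frequencies of a 1-D diatomic chain.

    Assumed model: masses m1 and m2 alternate, every neighbouring pair is
    joined by the same spring constant, and a is the length of the unit cell.
    """
    s = 1.0 / m1 + 1.0 / m2
    inside = s * s - 4.0 * math.sin(0.5 * k * a) ** 2 / (m1 * m2)
    root = math.sqrt(max(inside, 0.0))  # guard against tiny negative rounding
    w2_acoustic = spring * (s - root)
    w2_optical = spring * (s + root)
    return math.sqrt(max(w2_acoustic, 0.0)), math.sqrt(w2_optical)


if __name__ == "__main__":
    for k in (0.0, 0.5, 1.0, math.pi):
        print(k, diatomic_branches(k))
```

For small k the lower branch is linear in k (a sound wave), while the upper branch starts at a finite frequency; this is the sense in which extra atoms in the cell add extra frequencies for the same k.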

In the three-dimensional crystal we can still use the approximation 
of continuity to good advantage. In this approximation the true lattice 
structure of the crystal generally makes itself felt through the existence 
of different properties in different directions (e.g., anisotropic compress- 
ibility). The symmetry of the lattice is reflected by the symmetry of 
the elastic constants. Furthermore, the fundamental modes have vibra- 
tion directions (polarization directions) which are not necessarily either 
parallel to or perpendicular to the direction of propagation of the wave. 

For the present discussion, we shall assume that our substance dis- 
plays the same elastic constants in all directions. (In general, it is not 
necessary for any crystal, even one as symmetric as a cubic crystal, to 
do this.) Then the waves are of two kinds, longitudinal and transverse. 
These two kinds of waves have different wave velocities, which we shall 
label $c_L$ for the longitudinal and $c_T$ for the transverse. For each k there
are three modes. One of these has the frequency $\omega_L = c_L k$ (where k is
the absolute magnitude of k). Since, by hypothesis, there is no direc- 
tional effect, the frequency is a function only of the absolute magnitude 
of the wave number and does not depend upon its specific components. 
There are two transverse modes (i.e., modes in which the direction of 
motion of the atoms is perpendicular to the direction of motion of the 
wave), both of which have the frequency $\omega_T = c_T k$.

Every separate mode, and that includes every separate direction of 
polarization, behaves like an independent oscillator. 

Suppose we are dealing with a crystal of volume V. Let us compute
the number of modes whose wave numbers lie in the k-space volume
element $d^3\mathbf{k} = dk_x\,dk_y\,dk_z$ centered about the point k. We assume
the crystal is rectangular with edge lengths $L_x$, $L_y$, and $L_z$. We use
the results obtained from the one-dimensional example to see that the
discrete values of $k_x$ are spaced apart a distance $2\pi/L_x$. So, in the range
of wave number $dk_x$ there are $dk_x L_x/2\pi$ discrete values of $k_x$. Applying
this same reasoning to the other directions, we find that the number of
discrete values of k included in the interval is

$$\frac{dk_x\,dk_y\,dk_z}{(2\pi)^3}\,L_x L_y L_z = V\,\frac{d^3\mathbf{k}}{(2\pi)^3}$$    (8.117)

This same result is obtained (in the limit of large crystals) for any shape.
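
The counting argument can be checked by brute force. The sketch below is only an illustration under stated assumptions: the function names, the box dimensions, and the cutoff `k_max` are hypothetical, and the allowed wave numbers are taken to be integer multiples (of either sign) of 2π divided by the edge length. It counts the discrete k values inside a sphere in k-space and compares the count with the continuum estimate, volume times d³k/(2π)³ integrated over the sphere.

```python
import math


def count_modes(lx, ly, lz, k_max):
    """Count wave vectors k = 2*pi*(nx/lx, ny/ly, nz/lz) with |k| below k_max."""
    two_pi = 2.0 * math.pi
    nx_max = int(k_max * lx / two_pi) + 1
    ny_max = int(k_max * ly / two_pi) + 1
    nz_max = int(k_max * lz / two_pi) + 1
    count = 0
    for nx in range(-nx_max, nx_max + 1):
        kx = two_pi * nx / lx
        for ny in range(-ny_max, ny_max + 1):
            ky = two_pi * ny / ly
            for nz in range(-nz_max, nz_max + 1):
                kz = two_pi * nz / lz
                if k_max * k_max > kx * kx + ky * ky + kz * kz:
                    count += 1
    return count


def continuum_estimate(lx, ly, lz, k_max):
    """Volume of the box times the k-space volume of the sphere over (2*pi)^3."""
    volume = lx * ly * lz
    sphere = 4.0 / 3.0 * math.pi * k_max ** 3
    return volume * sphere / (2.0 * math.pi) ** 3


if __name__ == "__main__":
    dims = (20.0, 25.0, 30.0)
    print(count_modes(*dims, 3.0), continuum_estimate(*dims, 3.0))
```

The two numbers agree more closely as the box grows, since the discrepancy comes only from the lattice points near the surface of the sphere.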

For the general case the modal frequency $\omega_{\mathbf{k}}$ is, as we have mentioned,
a very complicated function of k with several branches (values for the 
same k), but its determination is a problem of classical physics; then 
the forms of oscillation in the fundamental modes are known, as are the 
normal coordinates describing these modes. The quantum-mechanical 
problem is then reduced to the solution of a simple set of oscillators, 
and all the properties can be worked out easily. The excitation of each 
mode is called the excitation of a phonon. 

As a very simple special example, we shall consider the longitudinal 
modes of oscillation in an isotropic solid (i.e., sound or, in particular, 
longitudinal sound). We can start as we did in the one-dimensional 
example with the atoms in the crystal discretely spaced and later pass 
to the long-wavelength limit, or approximation of continuity. 

A complete solution would show us all the effects of dispersion, the 
complicated branches, and the transverse waves. It is a very interest- 
ing study. However, one need not carry out all of the steps in order to 
obtain the proper quantum-mechanical form of the continuity approx- 
imation. One can make use directly of the results of classical physics. 
The entire procedure, starting with discretely spaced point masses, then 
passing to the long-wavelength limit, is just as useful and just as valid in 
quantum mechanics as it is in classical physics. The lagrangian has the 
same form so long as one restriction is imposed, i.e., that the potential 
can be adequately represented by a quadratic function of the displace- 
ments. The reason for the similarity between the results of the classical 
and quantum-mechanical approach is that the procedure consists only 
of various linear transformations, e.g., transforming to normal coordi- 
nates followed by certain approximations, such as the approximation of 
continuity. These transformations and approximations can be done in 
quantum mechanics exactly as they are done in classical physics. 

The equations derived from classical physics are as follows. Suppose
u(r,t) represents the displacement of a particle whose equilibrium po-
sition is at r. We assume that we are working in the long-wavelength
region, so that the approximation of continuity applies. A plane-wave
mode is easiest to describe in terms of the Fourier transform given by

$$\mathbf{U}(\mathbf{k},t) = \iiint \mathbf{u}(\mathbf{r},t)\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d^3\mathbf{r}$$    (8.118)


where r is a spatial vector having the components x, y, z. The normal 
coordinates of the various modes depend on the relationship between 
the direction of U and the direction of the wave vector k. That is, 
the coordinate $U_x(\mathbf{k},t)$ of the vector U does not necessarily represent a
normal mode. For an isotropic material the three modes of a given k 
have the following normal coordinates: 


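$$U_0(\mathbf{k},t) = \frac{\mathbf{k}}{k}\cdot\mathbf{U}(\mathbf{k},t)$$    (8.119)

(that is, the component of U in the direction of k) and

$$U_1(\mathbf{k},t) = \mathbf{e}_1\cdot\mathbf{U}(\mathbf{k},t)$$    (8.120)
$$U_2(\mathbf{k},t) = \mathbf{e}_2\cdot\mathbf{U}(\mathbf{k},t)$$    (8.121)

where $\mathbf{e}_1$ and $\mathbf{e}_2$ are two unit vectors perpendicular to k and perpendic-
ular to each other. For the present study we shall restrict our attention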
to just that part of the kinetic and potential energy which arises from 
the longitudinal modes given by Eq. (8.119) and omit the transverse 
oscillations. 

Using the results of classical physics, the lagrangian for the longitu-
dinal modes can be written as

$$L = \frac{\rho}{2}\int\left[\,|\dot U_0(\mathbf{k},t)|^2 - k^2c^2\,|U_0(\mathbf{k},t)|^2\,\right]\frac{d^3\mathbf{k}}{(2\pi)^3}$$    (8.122)

Here we have introduced the speed of sound c = ω/k, which is a function
of the direction of propagation. This is a direct generalization from the
one-dimensional example. In terms of the original variables u(r, t) the
lagrangian is

$$L = \frac{\rho}{2}\iiint\left[\left(\frac{\partial\mathbf{u}}{\partial t}\right)^2 - c^2\,(\nabla\cdot\mathbf{u})^2\right]d^3\mathbf{r}$$    (8.123)

The first term on the right-hand side of this equation is the kinetic en-
ergy, given by one-half the mass times the square of the velocity. The
second term is the energy of compression given by $\nabla\cdot\mathbf{u}$, which is the
compressional strain. No energy of shear strain is included here because
we have disregarded transverse elastic waves.

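Variation of the lagrangian with respect to u produces the classical
equations of motion as

$$\frac{1}{c^2}\frac{\partial^2\mathbf{u}}{\partial t^2} = \nabla(\nabla\cdot\mathbf{u})$$    (8.124)

If we define a compressional strain function equal to the divergence of
u, that is,

$$\phi(\mathbf{r},t) = \nabla\cdot\mathbf{u}(\mathbf{r},t)$$    (8.125)

we have the result

$$\frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = \nabla^2\phi$$    (8.126)

which is the classical wave equation.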


The Fourier transform of Eq. (8.124), using the kernel $e^{i\mathbf{k}\cdot\mathbf{r}}$ and
taking the component of the result parallel to k, gives

$$\frac{1}{c^2}\frac{\partial^2 U_0(\mathbf{k},t)}{\partial t^2} = -k^2\,U_0(\mathbf{k},t)$$    (8.127)


This is the equation of a single harmonic oscillator, and it shows us that 
Uo(k, t) is indeed a normal coordinate. 

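The quantum-mechanical results from the lagrangian given by
Eq. (8.123) can be obtained easily. The energy levels of the mode in
question are given by $n\hbar(kc)$ above the ground level. Let us ask for the
amplitude to go from a given initial set of coordinates u(r, 0) to a given
final set of coordinates u(r, T). It is

$$K[\mathbf{u}(\mathbf{r},T),T;\,\mathbf{u}(\mathbf{r},0),0] = \int\exp\left\{\frac{i}{\hbar}\,\frac{\rho}{2}\int_0^T\!\!\iiint\left[\left(\frac{\partial\mathbf{u}}{\partial t}\right)^2 - c^2\,(\nabla\cdot\mathbf{u})^2\right]d^3\mathbf{r}\,dt\right\}\mathcal{D}^3\mathbf{u}(\mathbf{r},t)$$    (8.128)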


The path integral of Eq. (8.128) is carried out over the paths u(r, t) 
defined in terms of all three components of the vector r, as well as 
the time t. It is subject, of course, to the condition that the function 
u(r,t) take on a given form at both the initial and final points. This 
is an interesting extension of our original path integral idea. Up to 
now we have dealt with integrands which were functionals of one (or 
perhaps a few) function x(t) of one variable t, and we have carried 
out the integration over all such paths, or functions. Now we must 
integrate a functional of the function u(r, t) of four variables x, y, z, and 
t and carry out the path integration over all values of this function. We 
can accomplish this by the regular techniques which we have described
before, for our integrand is still a gaussian functional. 

The first step in the solution of the path integral is to find the path 
which leads to a stationary value for the integral appearing in the expo- 
nent, the one which satisfies Eq. (8.124), or, more conveniently, the wave 
equation given by Eq. (8.126). We must impose the required boundary 
conditions at the times t = 0 and t = T. Satisfying the boundary con- 
ditions is not a difficult problem; however, it is a little different from 
the usual problem in classical physics in which the coordinate and its 
derivative are given at t = 0, that is, u(r,0) and $(\partial\mathbf{u}/\partial t)_{t=0}$.

We could proceed along this line and solve the problem. However, 
we have learned from previous examples that it is much easier to trans- 
form the problem into normal coordinates before carrying out the path 
integral. Such a transformation gives us (using Eq. A.11) 


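$$K = \prod_{\mathbf{k}}\int_{U_0(0)}^{U_0(T)}\exp\left\{\frac{i\rho}{2\hbar V}\int_0^T\left[\,|\dot U_0|^2 - k^2c^2\,|U_0|^2\,\right]dt\right\}\mathcal{D}U_0(\mathbf{k},t)$$    (8.129)

where the boundary values are given by

$$U_0(T) = U_0(\mathbf{k},T) = \frac{\mathbf{k}}{k}\cdot\iiint\mathbf{u}(\mathbf{r},T)\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d^3\mathbf{r}$$
$$U_0(0) = U_0(\mathbf{k},0) = \frac{\mathbf{k}}{k}\cdot\iiint\mathbf{u}(\mathbf{r},0)\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d^3\mathbf{r}$$    (8.130)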

This is once more the simpler type of path integral, where the path is 
described in terms of only the one variable t. Since the path integral can 
be written as a product of path integrals, each one defining the motion 
of a normal mode, we find that we have already solved the problem. The 
result is (see Eq. 8.1) 


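$$K = \prod_{\mathbf{k}}\left(\frac{\rho kc}{2\pi i\hbar V\sin kcT}\right)^{1/2}\exp\left\{\frac{i\rho kc}{2\hbar V\sin kcT}\Big(\left[U_0^2(\mathbf{k},T) + U_0^2(\mathbf{k},0)\right]\cos kcT - 2\,U_0(\mathbf{k},T)\,U_0(\mathbf{k},0)\Big)\right\}$$    (8.131)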

In the products over the components of k, the x component, for ex-
ample, takes on the values $2\pi n_x/L_x$, where $n_x$ is an integer running
from 0 to $N_x = L_x/d$. Here d is the spacing between atoms, and the
sample under study has edge lengths $L_x$, $L_y$, and $L_z$. Of course, the
approximation of continuity implies a zero spacing between the atoms,
which means that the product is unbounded. However, we shall disre-
gard such problems and concentrate only on the form of those terms
showing dependence on initial and final coordinates. Thus, disregarding
the radical which multiplies the exponential term of Eq. (8.131), we can
write this equation approximately as


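$$K \approx \exp\left\{\frac{i\rho}{2\hbar}\int\frac{kc}{\sin kcT}\Big(\left[U_0^2(\mathbf{k},T) + U_0^2(\mathbf{k},0)\right]\cos kcT - 2\,U_0(\mathbf{k},T)\,U_0(\mathbf{k},0)\Big)\frac{d^3\mathbf{k}}{(2\pi)^3}\right\}$$    (8.132)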

The dependence of the amplitude on the boundary values $U_0(\mathbf{k},0)$
and $U_0(\mathbf{k},T)$ is contained in this last result. For any choice of these
functions [and they, in turn, depend on u(r, 0) and u(r, T), as shown by
Eqs. (8.130)] the integration in Eq. (8.132) can be carried out, formally,
and a final answer obtained. In this manner all questions about the
quantum-mechanical behavior of the system can be answered, at least
in principle.
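
The dependence on the boundary values comes from the classical action of each mode: for a single harmonic oscillator the exponent of the kernel is i/ħ times the action evaluated along the classical path. The following minimal sketch checks that statement numerically for one oscillator. The function names and the sample numbers are illustrative assumptions; for a crystal mode one would take the frequency `omega` to be kc and the effective mass `m` to be ρ/V, which is also an assumption made here for illustration.

```python
import math


def classical_path(t, xa, xb, omega, total_time):
    """Classical oscillator path with x(0) = xa and x(total_time) = xb."""
    return (xa * math.sin(omega * (total_time - t))
            + xb * math.sin(omega * t)) / math.sin(omega * total_time)


def classical_velocity(t, xa, xb, omega, total_time):
    """Time derivative of classical_path."""
    return omega * (-xa * math.cos(omega * (total_time - t))
                    + xb * math.cos(omega * t)) / math.sin(omega * total_time)


def action_numeric(xa, xb, m, omega, total_time, steps=20000):
    """Midpoint-rule integral of (m/2)(v^2 - omega^2 x^2) along the classical path."""
    dt = total_time / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        x = classical_path(t, xa, xb, omega, total_time)
        v = classical_velocity(t, xa, xb, omega, total_time)
        total += 0.5 * m * (v * v - omega * omega * x * x) * dt
    return total


def action_closed_form(xa, xb, m, omega, total_time):
    """(m omega / 2 sin(omega T)) [(xa^2 + xb^2) cos(omega T) - 2 xa xb]."""
    s = math.sin(omega * total_time)
    c = math.cos(omega * total_time)
    return m * omega / (2.0 * s) * ((xa * xa + xb * xb) * c - 2.0 * xa * xb)


if __name__ == "__main__":
    args = (0.3, -0.5, 1.2, 2.0, 0.7)
    print(action_numeric(*args), action_closed_form(*args))
```

The closed form has the same structure as the bracket multiplying cos kcT and the cross term in the exponent of the mode-by-mode result, which is why the product over modes can be written down without further integration.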





QUANTUM FIELD THEORY 


Suppose we are dealing with waves or modes which are described by con- 
tinuous functions, like u(r,t), for which there is no atomic substructure 
or for which the wavelengths are long enough that we can neglect such
a substructure. In this case we say that u(r,t) is a field, i.e., a function 
of each point in space. In the example we have just considered the field 
is the displacement field of sound. In this terminology the equations of 
motion are called the field equations. In the present chapter we have 
been dealing only with linear field equations. The lagrangians can be 
called the lagrangians for the field. The normal coordinates U(k, t) are 
the coordinates for the normal modes of the field. The description of 
these modes as quantum oscillators is called quantizing the field. The 
resultant theory is called quantum field theory, to distinguish it from 
the classical analysis of the equations. 

As we have seen, almost all of the effort in quantum field theory is 
devoted to solving the classical equations of motion to find the normal 
modes, an activity completely within the realm of classical physics. The 
“quantization” consists then of no more than the additional remark that 
each of the normal modes is a quantum oscillator, with energy levels 
$(n + \tfrac{1}{2})\hbar\omega$. Presented in this way, quantum field theory seems to be just
a special consequence of the Schrödinger equation, and not an extra
theory at all. 

That is, or should be, the case for any situation in which the field
variables (like sound displacement or pressure) are defined ultimately 
in terms of some combination of the basic mechanical variables. These 
basic variables describe the position of the particles, atoms, electrons, 
and also nuclei, which comprise the material carrying the field. For 
example, in the case of sound we assume that Schrödinger’s equation
describes the motion of the constituent parts, or atoms, in a crystal. 
Then we easily deduce that the long-wavelength sound waves obey the 
classical linear field equations, and we find that the modes are quantized. 

In a few cases the classical equations of some field pertaining to 
a system are known, even though the quantum-mechanical derivation 
starting from Schrödinger’s equation has not yet been made. For exam-
ple, the equations describing the oscillations of a drop of nuclear matter
have been guessed by classical analogy.¹ In such a situation it is an
excellent guess that the modes of the field will turn out to be quantized 
oscillators if and when the complete quantum-mechanical derivation is 
worked out. Actually, not many such examples are left. Nearly all cases 
have by now been worked out. 

Another type of field equation, fundamentally different from that de- 
scribed above, exists in quantum mechanics. An example of this type 
is Maxwell’s set of electromagnetic equations, a set of linear field equa- 
tions. These equations lead to a wave equation which is analogous to 
the one we developed for sound, although there are different polarization 
conditions. Just as an organ pipe has standing waves, or modes, so an 
electromagnetic field in a cavity can be described classically in terms of 
fundamental modes of oscillation. It is a natural inference that these 
oscillations are also quantized in the sense that each mode can have the 
energy levels $n\hbar\omega$ above the ground level, etc. This is the fundamental
assumption of the quantum theory of electromagnetism. It is not a strict
deduction from the Schrödinger equation for matter, because the elec-
tromagnetic field is not understood as a long-wavelength approximation
of an atomic medium. Today, we do not think of any particular medium, 
but take the equations of Maxwell for granted. We simply assume they 
are to be quantized in the simple direct manner described above. We 
shall discuss this example in more detail in Chap. 9. 

The assumption of quantization for the electromagnetic fields turns 
out to be consistent with all experiments carried out so far, although 
there are some theoretical difficulties. These difficulties are associated 
with the extension of the scheme to modes of very short wavelengths. 
There are various effects which lead to diverging integrals if the integra-
tions are carried towards zero wavelength. The corresponding difficulties
do not really arise in a vibrating crystal because if we wish to carry the
analysis into the very short wavelength region, where the wavelengths
are comparable to the atomic spacing, we must drop the approximation
of continuity. Then in the case of a crystal we find that there are only
a finite number of modes in any finite volume, while in electrodynamics
the number of modes in any volume is infinite.

¹M. S. Plesset, On the Classical Model of Nuclear Fission, Am. J. Phys., vol. 9,

When the various modes of a field are excited, we say there are 
“things” present which have different names for different cases. For 
sound or crystal vibrations we call them phonons, for the electromagnetic 
field photons, for meson field theory mesons, etc. Even electrons can be 
represented as being excitations of a field, but it is a field of a very 
different kind from what we have been discussing. It is called a Fermi 
field; the particles obey the exclusion principle, and the lagrangian is 
quantized not by representing it as a set of harmonic oscillators, but in 
a different way. Fields quantized as modes of harmonic oscillators are 
called Bose particles; they obey Bose or symmetric statistics. This just 
means that if one has two particles, one of wave vector $\mathbf{k}_1$ and one of $\mathbf{k}_2$,
there is only one state. There is no new state where the first has $\mathbf{k}_2$ and
the second has $\mathbf{k}_1$. This is because for our field there is only one state
with $\mathbf{k}_1$ and $\mathbf{k}_2$ each excited to the first level. It has energy $\hbar\omega_1 + \hbar\omega_2$,
and it is meaningless to ask: After an exchange, which excitation is 
which? In the next chapter we discuss this in more detail for the case 
of photons of the electromagnetic field. 


Problem 8-7 It is believed that neutral particles of spin zero (like
neutral pions) can, when free, be represented by a field $\phi(\mathbf{r},t)$ with a
lagrangian

$$L = \frac{1}{2}\iiint\left[\left(\frac{\partial\phi}{\partial t}\right)^2 - c^2\,(\nabla\phi)^2 - \frac{\mu^2c^4}{\hbar^2}\,\phi^2\right]d^3\mathbf{r}$$    (8.133)

where μ is some constant. Show that this field has quantized states
corresponding to waves $e^{i\mathbf{k}\cdot\mathbf{r}}$, where the energy of excitation is

$$\hbar\omega = \sqrt{(\hbar kc)^2 + (\mu c^2)^2}$$    (8.134)

If $\hbar\mathbf{k} = \mathbf{p}$ is considered as the momentum of each excitation the energy
is

$$E = \sqrt{(|\mathbf{p}|c)^2 + (\mu c^2)^2}$$    (8.135)

This is the relativistic formula for the energy of a particle of momentum
p and mass μ. (Note: For $p^2$ small it is approximately
the rest energy $\mu c^2$ plus the kinetic energy $p^2/2\mu$.)
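
The small-momentum remark in Problem 8-7 can be checked numerically. This is a minimal sketch, assuming units in which ħ and c default to 1; the function names and the default arguments are illustrative choices, not part of the problem statement.

```python
import math


def excitation_energy(k, mu, c=1.0, hbar=1.0):
    """Energy of one quantum: sqrt((hbar k c)^2 + (mu c^2)^2)."""
    return math.sqrt((hbar * k * c) ** 2 + (mu * c * c) ** 2)


def nonrelativistic_energy(p, mu, c=1.0):
    """Rest energy plus kinetic energy, valid when p is small compared with mu c."""
    return mu * c * c + p * p / (2.0 * mu)


if __name__ == "__main__":
    for p in (0.01, 0.1, 1.0):
        # With hbar = 1 the wave number k and the momentum p coincide.
        print(p, excitation_energy(p, 1.0), nonrelativistic_energy(p, 1.0))
```

The two expressions differ at order p⁴/(8μ³c²), so the agreement degrades quickly once p becomes comparable with μc.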

We interpret the state of the field when the mode $\mathbf{k}_1$ is excited to
the second quantum level, $\mathbf{k}_2$ to the first, etc., as the state of a system
containing two particles with momentum $\hbar\mathbf{k}_1$, one with $\hbar\mathbf{k}_2$, etc. The
ground state is considered the state in which no particles are present, 
and it is called the vacuum state. Excitation or deexcitation of the field 
oscillators corresponds to creation or annihilation of particles, and this 
is the way that such processes are represented in relativistic quantum 
field theory. 


THE FORCED HARMONIC OSCILLATOR 


In this chapter we have dealt with the simple harmonic oscillator or 
with systems that could be reduced to a set of such oscillators. But the 
oscillators have been free, not interacting with anything else. We must 
develop our analysis further if we wish to deal with such linear systems 
in interaction with other systems or driven by external forces. Exam- 
ples of such systems include polyatomic molecules in varying external 
fields, colliding polyatomic molecules, crystals through which an elec- 
tron is passing and exciting the oscillator modes, and other interactions 
of the modes with external fields. We shall not discuss the problem of 
interaction in general; instead, we use as a prototype the example of 
the interaction of atomic systems and charges with the electromagnetic 
field. We do this in the next chapter. Other cases may be analyzed by 
direct analogy. 

These problems involve two aspects: (1) the resolution of the field 
into its component independent oscillators and (2) the interaction of 
each oscillator with external potentials or other systems. The resolution 
into oscillators has been exhaustively studied so far in this chapter. 

To prepare the complete machinery for such problems, it remains 
only to analyze the behavior of a single oscillator disturbed by an exter- 
nal potential. We shall put these pieces together in the next chapter. 

In this section we go back to the study of a single harmonic oscillator, 
but coupled linearly to some external potential or disturbance. The 
lagrangian for such a system is given by 

$$L(\dot x, x, t) = \frac{M}{2}\dot x^2 - \frac{M\omega^2}{2}x^2 + f(t)\,x$$    (8.136)
where f(t) is the external force. We assume for convenience that it is 
turned on only during a certain time interval T from t = 0 to t = T, so 
that the oscillator is free initially at t = 0 and free finally at t = T. In 
Prob. 3-11 we completely solved this problem, obtaining the amplitude
K(b,a) that the oscillator goes from point $x_a$ at t = 0 to point $x_b$ at
t = T. But for the present applications it is convenient to find as well the
amplitude $G_{mn}$ that the oscillator initially in energy state n is found at
time T in energy state m. This representation is often more convenient
than the coordinate representation. 

In Sec. 8-1 we determined the wave functions $\phi_n(x)$ for the free har-
monic oscillator, and in Prob. 3-11 we evaluated the kernel describing
the motion of a forced harmonic oscillator. This means that we can
determine the amplitude $G_{mn}$ by direct substitution into

$$G_{mn} = e^{(i/\hbar)E_mT}\iint\phi_m^*(x_b)\,K(x_b,T;x_a,0)\,\phi_n(x_a)\,dx_a\,dx_b$$    (8.137)

For the case m = n = 0 this integral is a gaussian somewhat lengthy
to evaluate but presenting no special problems. The result is

$$G_{00} = \exp\left\{-\frac{1}{2\hbar M\omega}\int_0^T\!\!\int_0^t f(t)\,f(s)\,e^{-i\omega(t-s)}\,ds\,dt\right\}$$    (8.138)


If m and n are not equal to 0, then the integral is somewhat more 
complicated. However, we can use the same sort of trick that we used 
in Prob. 8-1. We shall ask for the amplitude that a forced harmonic 
oscillator goes from the state w to the state X, where these two states 
are defined in Prob. 8-1. This amplitude is (using Eq. 8.28) 


5 3 Ginn, (bjp, (aje 7C P EmT (8.139) 

If we can work out F(b,a), we can get $G_{mn}$ through multiplying F by
$\exp\{(M\omega/4\hbar)(b^2 + a^2)\}$ and developing the resulting expression in a
power series in a and b. That is, we want first to solve


Mw 1/2 
F(b, a) = (=) 
(8.140)


where $K(x_b,T;x_a,0)$ is the kernel for a forced harmonic oscillator,
Eq. (3.66). The variables appear only quadratically in the exponent
of this integrand, so that the integration can be performed easily. Some
of the resulting algebra is a little lengthy; however, eventually one finds





F(b,a) = exp a kel + b* = Qabe T] (8.141) 
where

$$\beta = \frac{1}{\sqrt{2M\hbar\omega}}\int_0^T f(t)\,e^{-i\omega t}\,dt$$    (8.142)

$$\beta^* = \frac{1}{\sqrt{2M\hbar\omega}}\int_0^T f(t)\,e^{+i\omega t}\,dt$$    (8.143)

The value of $G_{00}$ can be obtained easily from Eq. (8.141) by setting
a=b=0. The result is the same as Eq. (8.138). Next we multiply by 
the exponential function, as described below Eq. (8.139), and find, by 
putting 


Mw (Mw, iwut 
$$\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}G_{mn}\,\frac{x^n y^m}{\sqrt{m!\,n!}} = G_{00}\exp\{xy + i\beta x + i\beta^* y\}$$    (8.144)

By expanding the right-hand side in powers of x and y and comparing
terms, we obtain the final result

$$G_{mn} = \frac{G_{00}}{\sqrt{m!\,n!}}\sum_{r=0}^{l}\frac{m!\,n!}{(m-r)!\,(n-r)!\,r!}\,(i\beta)^{n-r}\,(i\beta^*)^{m-r}$$    (8.145)

where l is the smaller of m or n.
This completely solves the problem of a forced harmonic oscillator.
We shall discuss it further and make use of it in the next chapter.
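
A quick numerical check of the final result is that the transition probabilities out of any initial level add up to one. The sketch below is an illustration under stated assumptions: it takes the amplitude in the form (G00/√(m! n!)) Σ_r [m! n!/((m−r)! (n−r)! r!)] (iβ)^(n−r) (iβ*)^(m−r), keeps only the modulus of G00, and assumes β is normalized so that |G00|² = exp(−|β|²). The function name and the sample value of β are hypothetical.

```python
import math


def transition_amplitude(m, n, beta):
    """Amplitude G_mn (up to the phase of G_00) for a complex drive parameter beta."""
    g00 = math.exp(-0.5 * abs(beta) ** 2)  # modulus only; the phase is dropped
    beta = complex(beta)
    total = 0j
    for r in range(min(m, n) + 1):
        coeff = (math.factorial(m) * math.factorial(n)
                 / (math.factorial(m - r) * math.factorial(n - r) * math.factorial(r)))
        total += coeff * (1j * beta) ** (n - r) * (1j * beta.conjugate()) ** (m - r)
    return g00 * total / math.sqrt(math.factorial(m) * math.factorial(n))


def total_probability(n, beta, m_max=60):
    """Sum of |G_mn|^2 over final levels m below m_max."""
    return sum(abs(transition_amplitude(m, n, beta)) ** 2 for m in range(m_max))


if __name__ == "__main__":
    print(total_probability(0, 0.7 + 0.3j), total_probability(2, 0.7 + 0.3j))
```

Starting from the ground level (n = 0) the sum over r has a single term, and the probabilities |G_m0|² form a Poisson distribution in m with mean |β|².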





Quantum Electrodynamics 


In this chapter we shall discuss the interaction between charged particles
and an electromagnetic field. We have seen one example of such an 
interaction in Sec. 7-6, where the electromagnetic field variables entered 
into the potential term of the lagrangian. The electromagnetic term 
introduced in that section is the vector potential A. Section 7-6 deals 
only with the motion in a definite given field. It does not tell us anything 
about how the field A arises or how it is affected by the moving particles. 
That is, the formulation of the problem does not contain any analysis 
of the dynamics of the field. Such an approach, using given potentials, 
is only an approximation. It is valid when these potentials arise from 
such large pieces of apparatus that the motion of the particle does not 
affect the potential. 

In this chapter we shall be concerned not only with the way in which 
potentials affect the motion of the particle, but also with the way in 
which the particle affects the potentials. We shall start with the classical 
approach and use Maxwell’s equations to describe the dynamics of the 
electromagnetic field. These equations express the field in terms of the 
charge and current density of the matter present. 

We have found in preceding chapters that the quantum-mechanical 
laws which correspond to some classical system can be easily determined 
if only we can express the classical laws in the form of a least-action 
principle. Thus we have found that if the extremum of some action 
S, varied with respect to some variable x, corresponds to the classical 
equation of motion, then the quantum-mechanical laws are expressed 
as follows: The quantum-mechanical amplitude for any given situation, 
corresponding to the action S, is the path integral of $e^{iS/\hbar}$ integrated
over all possible paths of the variable x which fit the conditions of the 
situation. 

It is vital to our present approach that classical electrodynamics, 
as expressed by Maxwell’s equations, can be written as a principle of 
least action. An action S exists which can be expressed in terms of 
the vector and scalar potentials A and φ. The determination of an
extremum for this action, by variation of the field variables A(r,t) and
φ(r,t), leads to a formulation of electrodynamics equivalent to Maxwell’s
equations. Hence, quantum electrodynamics results from the rule that 
the amplitude for an event is 


$$K(b,a) = \int_a^b e^{(i/\hbar)S[\mathbf{A},\phi]}\,\mathcal{D}\mathbf{A}(\mathbf{r},t)\,\mathcal{D}\phi(\mathbf{r},t)$$    (9.1)

where the path integral is over all values of A and φ at each point of
space and time, subject to the boundary conditions at the initial and
final points of the event (cf. Eq. 8.128).


CLASSICAL ELECTRODYNAMICS


Maxwell’s Equations. We shall begin our study of electrodynam- 
ics from the customary classical fundamentals, i.e., from Maxwell’s equa- 
tions. We shall assume the magnetic permeability and dielectric constant 
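are those for free space. Then, with E as the electric field vector, B as
the magnetic field vector, c as the speed of light, ρ as the charge density,
and j as the current density, Maxwell’s equations are

$$\nabla\cdot\mathbf{E} = 4\pi\rho$$    (9.2)
$$\nabla\cdot\mathbf{B} = 0$$    (9.3)
$$\nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}$$    (9.4)
$$\nabla\times\mathbf{B} = \frac{1}{c}\left(\frac{\partial\mathbf{E}}{\partial t} + 4\pi\mathbf{j}\right)$$    (9.5)

These equations make sense only if charge is conserved, that is,

$$\nabla\cdot\mathbf{j} = -\frac{\partial\rho}{\partial t}$$    (9.6)

Equation (9.3) implies that B is the curl of some vector A:

$$\mathbf{B} = \nabla\times\mathbf{A}$$    (9.7)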


This relation does not fully determine A; we still may specify its diver- 
gence. We choose 


$$\nabla\cdot\mathbf{A} = 0$$    (9.8)


This choice is not recommended if it is desired to keep the full relativistic 
four-dimensional symmetry of the equations in evidence. (It is not that 
the results using Eq. (9.8) are not relativistically invariant; for the results 
are independent of the choice of $\nabla\cdot\mathbf{A}$. It is, rather, that the invariance
does not appear obvious at first glance.) In our case we shall deal with 
matter in the nonrelativistic approximation anyway (for we do not have 
a simple path integral for the Dirac equation). We wish to illustrate 
the properties of the quantized electromagnetic field, and the results are 
least cumbersome with the choice of Eq. (9.8). 

Substitution into Eq. (9.4) shows that $\mathbf{E} + (1/c)\,\partial\mathbf{A}/\partial t$ has zero curl,
so it must be the gradient of some scalar potential

$$\mathbf{E} = -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}$$    (9.9)

From Eqs. (9.2), (9.8), and (9.9) we see that

$$\nabla\cdot\mathbf{E} = -\nabla^2\phi = 4\pi\rho$$    (9.10)


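If there is no charge and no current density, the equations are easily
solved. In Eq. (9.10) ρ = 0, so φ = 0 and $\mathbf{E} = -(1/c)\,\partial\mathbf{A}/\partial t$. In
Eq. (9.5) with j = 0 this gives [note: $\nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}$]

$$\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = 0$$    (9.11)

Thus, each component of A satisfies a wave equation.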
If we assume A is a running plane wave, that is,

$$\mathbf{A}(\mathbf{r},t) = \mathbf{a}_{\mathbf{k}}(t)\,e^{i\mathbf{k}\cdot\mathbf{r}}$$    (9.12)

the equation for the amplitude $\mathbf{a}_{\mathbf{k}}$ is $\ddot{\mathbf{a}}_{\mathbf{k}} = -k^2c^2\,\mathbf{a}_{\mathbf{k}}$, which implies that
$\mathbf{a}_{\mathbf{k}}$ is a simple harmonic oscillator with frequency ω = kc for each
component direction of a. Actually there are only two independent transverse
waves; the component of $\mathbf{a}_{\mathbf{k}}$ in the direction of k must be zero. This is
the implication of Eq. (9.8), which can be rewritten as

$$\mathbf{k}\cdot\mathbf{a}_{\mathbf{k}} = 0$$    (9.13)

Thus the field in free space is equivalent to a set of free harmonic
oscillators with two transverse waves for each value of k.


Problem 9-1 Show that E, B, and k are mutually perpendicular 
for this plane-wave solution. 


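Solution with Charges and Currents Present. We shall expand
the solutions for A, φ, and the current and charge density in plane waves,
writing

$$\mathbf{A}(\mathbf{r},t) = \sqrt{4\pi}\,c\int\mathbf{a}_{\mathbf{k}}(t)\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\frac{d^3\mathbf{k}}{(2\pi)^3}$$
$$\phi(\mathbf{r},t) = \int\phi_{\mathbf{k}}(t)\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\frac{d^3\mathbf{k}}{(2\pi)^3}$$    (9.14)
$$\mathbf{j}(\mathbf{r},t) = \int\mathbf{j}_{\mathbf{k}}(t)\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\frac{d^3\mathbf{k}}{(2\pi)^3}$$
$$\rho(\mathbf{r},t) = \int\rho_{\mathbf{k}}(t)\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\frac{d^3\mathbf{k}}{(2\pi)^3}$$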

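Problem 9-2 Explain why the charge density corresponding to a
single charge e located at the point x(t) = (x(t), y(t), z(t)) at time t is

$$\rho(\mathbf{r},t) = e\,\delta(r_x - x(t))\,\delta(r_y - y(t))\,\delta(r_z - z(t)) = e\,\delta^3(\mathbf{r} - \mathbf{x}(t))$$

Show that

$$\rho_{\mathbf{k}}(t) = e\,e^{-i\mathbf{k}\cdot\mathbf{x}(t)}$$    (9.15)

Explain why the current density is $\mathbf{j}(\mathbf{r},t) = e\,\dot{\mathbf{x}}(t)\,\delta^3(\mathbf{r} - \mathbf{x}(t))$. If we have
a number of charges $e_i$ located at $\mathbf{x}_i(t)$, the values of $\rho_{\mathbf{k}}$ and $\mathbf{j}_{\mathbf{k}}$ are

$$\rho_{\mathbf{k}} = \sum_i e_i\,e^{-i\mathbf{k}\cdot\mathbf{x}_i(t)} \qquad \mathbf{j}_{\mathbf{k}} = \sum_i e_i\,\dot{\mathbf{x}}_i(t)\,e^{-i\mathbf{k}\cdot\mathbf{x}_i(t)}$$    (9.16)

If the expansions of E and B are

$$\mathbf{E}(\mathbf{r},t) = \int\mathbf{E}_{\mathbf{k}}(t)\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\frac{d^3\mathbf{k}}{(2\pi)^3} \qquad \mathbf{B}(\mathbf{r},t) = \int\mathbf{B}_{\mathbf{k}}(t)\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\frac{d^3\mathbf{k}}{(2\pi)^3}$$

then, using Eqs. (9.9) and (9.7), the expansion coefficients satisfy

$$\mathbf{E}_{\mathbf{k}} = -i\mathbf{k}\,\phi_{\mathbf{k}} - \sqrt{4\pi}\,\dot{\mathbf{a}}_{\mathbf{k}} \qquad\text{and}\qquad \mathbf{B}_{\mathbf{k}} = \sqrt{4\pi}\,c\,i\,(\mathbf{k}\times\mathbf{a}_{\mathbf{k}})$$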

From Eqs. (9.8) and (9.10), the coefficient of expansion of $\nabla\cdot\mathbf{E}$ is
$i\mathbf{k}\cdot\mathbf{E}_{\mathbf{k}} = k^2\phi_{\mathbf{k}}$, so we have

$$k^2\phi_{\mathbf{k}} = 4\pi\rho_{\mathbf{k}}$$    (9.17)

or $\phi_{\mathbf{k}} = 4\pi\rho_{\mathbf{k}}/k^2$. The function $\phi_{\mathbf{k}}$ is completely determined in terms
of the charge density $\rho_{\mathbf{k}}$; there are no dynamic differential equations to
solve, involving, for example, $\ddot\phi_{\mathbf{k}}$.


Problem 9-3 Prove that the relation $\phi_{\mathbf{k}} = 4\pi\rho_{\mathbf{k}}/k^2$ simply means
that $\phi_{\mathbf{k}}$ at any instant is the Coulomb potential from the charges at that
instant, so that, for example, if ρ comes from a number of charges $e_i$ at
distances $R_i$ from a point, the potential at the point is $\phi = \sum_i e_i/R_i$.
This is just the content of Eq. (9.10).


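Equation (9.5) still remains to be solved, that is,

$$i\mathbf{k}\times\mathbf{B}_{\mathbf{k}} = \frac{1}{c}\,\dot{\mathbf{E}}_{\mathbf{k}} + \frac{4\pi}{c}\,\mathbf{j}_{\mathbf{k}}$$    (9.18)

But (using $\mathbf{k}\cdot\mathbf{a}_{\mathbf{k}} = 0$)

$$i\mathbf{k}\times\mathbf{B}_{\mathbf{k}} = -\sqrt{4\pi}\,c\,\mathbf{k}\times(\mathbf{k}\times\mathbf{a}_{\mathbf{k}}) = \sqrt{4\pi}\,c\,k^2\,\mathbf{a}_{\mathbf{k}}$$

and $\dot{\mathbf{E}}_{\mathbf{k}} = -i\mathbf{k}\,\dot\phi_{\mathbf{k}} - \sqrt{4\pi}\,\ddot{\mathbf{a}}_{\mathbf{k}}$, and using Eq. (9.17) to express $\dot\phi_{\mathbf{k}}$ as
$4\pi\dot\rho_{\mathbf{k}}/k^2$, we get

$$\ddot{\mathbf{a}}_{\mathbf{k}} + k^2c^2\,\mathbf{a}_{\mathbf{k}} = \sqrt{4\pi}\left(\mathbf{j}_{\mathbf{k}} - \frac{i\mathbf{k}\,\dot\rho_{\mathbf{k}}}{k^2}\right) = \sqrt{4\pi}\,\mathbf{j}'_{\mathbf{k}}$$    (9.19)

where we can call $\mathbf{j}'_{\mathbf{k}} = \mathbf{j}_{\mathbf{k}} - i\mathbf{k}\,\dot\rho_{\mathbf{k}}/k^2$ the transverse part of $\mathbf{j}_{\mathbf{k}}$. The law of
conservation of current, expressed by Eq. (9.6), says that $\dot\rho_{\mathbf{k}} = -i\mathbf{k}\cdot\mathbf{j}_{\mathbf{k}}$,
so

$$\mathbf{j}'_{\mathbf{k}} = \mathbf{j}_{\mathbf{k}} - \frac{\mathbf{k}\,(\mathbf{k}\cdot\mathbf{j}_{\mathbf{k}})}{k^2}$$    (9.20)

which means that $\mathbf{j}'_{\mathbf{k}}$ is $\mathbf{j}_{\mathbf{k}}$ less its component in the direction of k.
Clearly, $\mathbf{k}\cdot\mathbf{j}'_{\mathbf{k}} = 0$.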
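
The statement that the transverse part is the current less its component along k can be made concrete with a few lines of code. This is a minimal sketch; the helper names `dot` and `transverse_part` and the sample vectors are illustrative assumptions. The components of the current may be complex, since they are Fourier coefficients, while k is real.

```python
def dot(u, v):
    """Plain (unconjugated) dot product of two 3-component sequences."""
    return sum(a * b for a, b in zip(u, v))


def transverse_part(j, k):
    """Return j minus its component along k, that is j - k (k . j) / k^2."""
    k_squared = dot(k, k)
    if k_squared == 0:
        raise ValueError("k must be nonzero to define a direction")
    scale = dot(k, j) / k_squared
    return tuple(j_i - scale * k_i for j_i, k_i in zip(j, k))


if __name__ == "__main__":
    k = (1.0, 2.0, 2.0)
    j = (3.0 + 1.0j, -1.0, 0.5j)
    j_t = transverse_part(j, k)
    print(j_t, dot(k, j_t))
```

Applying the projection twice changes nothing, which is one way to see why a current that is already resolved along directions perpendicular to k needs no further correction.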

We have certainly reduced Maxwell’s equations to a very simple form 
— aside from the instantaneous Coulomb interaction between particles, 
we have no more than the equations for two transverse waves for each 
value of k, the amplitude of each being a harmonic oscillator driven by 
the component of current in the corresponding direction. That is, if 
we choose two directions perpendicular to k, say 1 and 2, and call the 
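components of $\mathbf{a}_{\mathbf{k}}$ in these directions $a_{1,\mathbf{k}}$ and $a_{2,\mathbf{k}}$, Maxwell’s equations
are

$$\ddot a_{1,\mathbf{k}} + k^2c^2\,a_{1,\mathbf{k}} = \sqrt{4\pi}\,j_{1,\mathbf{k}}$$    (9.21)
$$\ddot a_{2,\mathbf{k}} + k^2c^2\,a_{2,\mathbf{k}} = \sqrt{4\pi}\,j_{2,\mathbf{k}}$$    (9.22)

where $j_{1,\mathbf{k}}$ and $j_{2,\mathbf{k}}$ are the components of $\mathbf{j}_{\mathbf{k}}$ in these directions. (Why
do we not need to say “of $\mathbf{j}'_{\mathbf{k}}$”?)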
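
Each of these equations is that of a classical oscillator driven by a prescribed force. The following minimal sketch, for a real-valued amplitude that starts at rest, compares two ways of obtaining the response: a superposition of impulse responses sin ω(t − s)/ω, and a direct Runge-Kutta integration. The function names, the step counts, and the sinusoidal test force are illustrative assumptions; `force` stands in for the right-hand side of the mode equation and `omega` for kc.

```python
import math


def driven_amplitude_green(force, omega, t_end, steps=4000):
    """a(t_end) for a'' + omega^2 a = force(t), a(0) = a'(0) = 0, by quadrature."""
    ds = t_end / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds  # midpoint rule
        total += math.sin(omega * (t_end - s)) / omega * force(s) * ds
    return total


def driven_amplitude_rk4(force, omega, t_end, steps=4000):
    """Same quantity from a fourth-order Runge-Kutta integration."""
    dt = t_end / steps
    a, v, t = 0.0, 0.0, 0.0

    def accel(a_val, t_val):
        return -omega * omega * a_val + force(t_val)

    for _ in range(steps):
        k1a, k1v = v, accel(a, t)
        k2a, k2v = v + 0.5 * dt * k1v, accel(a + 0.5 * dt * k1a, t + 0.5 * dt)
        k3a, k3v = v + 0.5 * dt * k2v, accel(a + 0.5 * dt * k2a, t + 0.5 * dt)
        k4a, k4v = v + dt * k3v, accel(a + dt * k3a, t + dt)
        a += dt * (k1a + 2.0 * k2a + 2.0 * k3a + k4a) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += dt
    return a


if __name__ == "__main__":
    drive = lambda t: math.sin(1.3 * t)
    print(driven_amplitude_green(drive, 2.0, 3.0), driven_amplitude_rk4(drive, 2.0, 3.0))
```

The quadrature form makes explicit that the amplitude at any time depends linearly on the earlier history of the driving current.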


The Least-action Principle. The hypothesis of quantum elec-
trodynamics¹ is that the oscillators defined in Eqs. (9.21) and (9.22) are
quantum oscillators. To carry out the quantization, we must find the
principle of least action which gives these defining equations of motion
as well as the equations of motion of the particles in the field. The action
is

$$S = S_1 + S_2 + S_3$$    (9.23)

where

$$S_1 = \sum_i\frac{m_i}{2}\int|\dot{\mathbf{x}}_i|^2\,dt$$    (9.24)

is the action of all the particles, disregarding the field (if there are non-
electric forces between the particles, they are to be included in $S_1$),

$$S_2 = -\iint\left[\rho(\mathbf{r},t)\,\phi(\mathbf{r},t) - \frac{1}{c}\,\mathbf{j}(\mathbf{r},t)\cdot\mathbf{A}(\mathbf{r},t)\right]d^3\mathbf{r}\,dt = -\sum_i e_i\int\left[\phi(\mathbf{x}_i(t),t) - \frac{1}{c}\,\dot{\mathbf{x}}_i(t)\cdot\mathbf{A}(\mathbf{x}_i(t),t)\right]dt$$    (9.25)

is the action of interaction of field and particles, and

$$S_3 = \frac{1}{8\pi}\iint\left(\mathbf{E}^2 - \mathbf{B}^2\right)d^3\mathbf{r}\,dt = \frac{1}{8\pi}\iint\left[\left|-\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}\right|^2 - |\nabla\times\mathbf{A}|^2\right]d^3\mathbf{r}\,dt$$    (9.26)

is the action of the field. The variables are A(r,t), φ(r,t), and $\mathbf{x}_i(t)$.

¹It should be pointed out here that some physicists use the term “quantum elec-
trodynamics” to include electron-positron pair theory. In the present chapter we do
not cover such problems. So for us, quantum electrodynamics means the quantum
theory of the electromagnetic field.


Problem 9-4 In Sec. 2-1 we discussed the mechanisms for obtaining
the mechanical equations of motion from the form of the action $S$ by
obtaining the extremum $S_{cl}$ under the condition $\delta S = 0$ for variations
of the coordinates, $\delta x$. Show how Maxwell's equations can be derived
from the action $S$ defined in Eq. (9.23) by requiring $\delta S = 0$ for first-order
variations of $\mathbf{A}$ and $\phi$.


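Since the dynamic equations are simplest in terms of the variables
$\mathbf{a}_{\mathbf{k}}$, it is worthwhile to express the action in these variables. Substitution
of the expansion given in Eqs. (9.14) into $S_3$ gives

$$S_3 = \frac{1}{8\pi}\iint\left[\left|i\mathbf{k}\phi_{\mathbf{k}} + \sqrt{4\pi}\,\dot{\mathbf{a}}_{\mathbf{k}}\right|^2 - 4\pi c^2\left|\mathbf{k}\times\mathbf{a}_{\mathbf{k}}\right|^2\right]\frac{d^3k\,dt}{(2\pi)^3}$$

$$\quad = \frac{1}{2}\iint\left[\dot{\mathbf{a}}^*_{\mathbf{k}}\cdot\dot{\mathbf{a}}_{\mathbf{k}} - k^2c^2\,\mathbf{a}^*_{\mathbf{k}}\cdot\mathbf{a}_{\mathbf{k}} + \frac{k^2}{4\pi}\,\phi^*_{\mathbf{k}}\phi_{\mathbf{k}}\right]\frac{d^3k\,dt}{(2\pi)^3} \tag{9.27}$$

and $S_2$ becomes

$$S_2 = -\iint\left[\rho_{-\mathbf{k}}\,\phi_{\mathbf{k}} - \sqrt{4\pi}\,\mathbf{j}_{-\mathbf{k}}\cdot\mathbf{a}_{\mathbf{k}}\right]\frac{d^3k\,dt}{(2\pi)^3} \tag{9.28}$$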


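Upon substitution of $\phi_{\mathbf{k}} = 4\pi\rho_{\mathbf{k}}/k^2$, the terms in $\phi_{\mathbf{k}}$ in $S_2$ and $S_3$ add
to give

$$S_c = -\frac{4\pi}{2}\iint\frac{\rho_{\mathbf{k}}\,\rho_{-\mathbf{k}}}{k^2}\,\frac{d^3k\,dt}{(2\pi)^3}
= -\frac{1}{2}\sum_i\sum_j e_ie_j\int\frac{dt}{|\mathbf{x}_i(t) - \mathbf{x}_j(t)|} \tag{9.29}$$

using Eqs. (9.16), since $\int(4\pi/k^2)\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d^3k/(2\pi)^3 = 1/r$. This is just the
Coulomb interaction between the charges, which is usually considered
in analyzing atoms when electromagnetic radiation effects are neglected.
That is, we shall include this interaction in the action $S_{mat}$ of the matter,

$$S_{mat} = S_1 + S_c = \sum_i\frac{m_i}{2}\int|\dot{\mathbf{x}}_i|^2\,dt - \frac{1}{2}\sum_i\sum_j e_ie_j\int\frac{dt}{|\mathbf{x}_i(t) - \mathbf{x}_j(t)|} \tag{9.30}$$

and write $S = S_{mat} + S_{int} + S_{rad}$. We have thus divided the action
of the electromagnetic field $S_3$ into two parts; one contributes to an
instantaneous Coulomb interaction, and the remainder we shall call the
radiation field $S_{rad}$. (The radiation field takes care of all corrections to
the instantaneous field, such as, for example, that the total effects are
retarded and act no faster than the speed of light.) The action of the
radiation field is $S_3$ less the terms involving $\phi_{\mathbf{k}}$. That is,

$$S_{rad} = \frac{1}{2}\iint\left(\dot{a}^*_{1,\mathbf{k}}\dot{a}_{1,\mathbf{k}} - k^2c^2\,a^*_{1,\mathbf{k}}a_{1,\mathbf{k}} + \dot{a}^*_{2,\mathbf{k}}\dot{a}_{2,\mathbf{k}} - k^2c^2\,a^*_{2,\mathbf{k}}a_{2,\mathbf{k}}\right)\frac{d^3k\,dt}{(2\pi)^3} \tag{9.31}$$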


which is just the action of the radiation oscillators. The action of
interactions of these oscillators with the particles is

$$S_{int} = \sqrt{4\pi}\iint\left(j_{1,-\mathbf{k}}\,a_{1,\mathbf{k}} + j_{2,-\mathbf{k}}\,a_{2,\mathbf{k}}\right)\frac{d^3k\,dt}{(2\pi)^3} \tag{9.32}$$

Clearly, the variation of the total action $S$ with respect to the $a_{1,\mathbf{k}}$
and $a_{2,\mathbf{k}}$ gives the equations of motion (9.21) and (9.22).

Written more explicitly, the action $S_{int}$ is

$$S_{int} = \sqrt{4\pi}\sum_i e_i\iint\left(\dot{x}_{1i}\,a_{1,\mathbf{k}} + \dot{x}_{2i}\,a_{2,\mathbf{k}}\right)e^{i\mathbf{k}\cdot\mathbf{x}_i}\,\frac{d^3k\,dt}{(2\pi)^3} \tag{9.33}$$

where $\dot{x}_{1i}$ and $\dot{x}_{2i}$ are the components of $\dot{\mathbf{x}}_i$ in the directions transverse to
$\mathbf{k}$. Thus all the laws of nonrelativistic mechanics and of electrodynamics
are contained in the proposition that $S$, the sum of Eqs. (9.30), (9.31),
and (9.33), is stationary for variations in the paths of the variables $\mathbf{x}_i(t)$,
$a_{1,\mathbf{k}}(t)$, and $a_{2,\mathbf{k}}(t)$. Quantum electrodynamics results from integrating
$e^{iS/\hbar}$ over these paths, and it is described in the next section.


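Problem 9-5 The momentum in the field is given by

$$\frac{1}{4\pi c}\int\mathbf{E}\times\mathbf{B}\;d^3r$$

In the absence of matter (so $\phi_{\mathbf{k}} = 0$), show this is $i\int\mathbf{k}\,(\mathbf{a}^*_{\mathbf{k}}\cdot\dot{\mathbf{a}}_{\mathbf{k}})\,d^3k/(2\pi)^3$.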


THE QUANTUM MECHANICS OF THE RADIATION FIELD 


We begin by discussing the quantum mechanics of the radiation field in 
empty space. There is no matter present, so the total action is that of 
the radiation field alone 


$$S = S_{rad} \tag{9.34}$$

as given by Eq. (9.31). It is evidently the action of a set of harmonic
oscillators. We have seen some examples of expressions like Eq. (9.31) in
Chap. 8. We make the assumption discussed in Sec. 8-8 that quantum
electrodynamics results from considering these as quantum-mechanical
oscillators.

The modes of our system are running waves, two for each value of
$\mathbf{k}$ (polarization 1 and 2) with frequency $\omega = kc$. For one of the modes,
say $a_{1,\mathbf{k}}$, the available energy levels are

$$E_{1,\mathbf{k}} = \left(n_{1,\mathbf{k}} + \tfrac{1}{2}\right)\hbar kc \tag{9.35}$$

where $n_{1,\mathbf{k}}$ is any positive integer or zero.

If $n_{1,\mathbf{k}} = 1$, we say there is one photon present of polarization 1 and
momentum $\hbar\mathbf{k}$; in general, we say $n_{1,\mathbf{k}}$ such photons are present. The
energy of a single photon of this kind is $\hbar kc$.

Later on when we consider the interactions of matter with the
radiation field, we shall find that the matter absorbs or emits one photon
at a time of energy $\hbar\omega$. Of course, this is the same as Planck's original
hypothesis.

It is quite striking and surprising that the states n of the oscillators 
can also be described by imagining that there are n “particles” or “pho- 
tons” present. It is clear, of course, that the energy values agree. But 
there is one further subtle point that must be noted before the oscillator 
states can be completely successfully described as particles. Suppose, 
for example, that just two of the $n_i$ differ from zero, say, $n_a = 1$, $n_b = 1$.
This single state we may wish to represent by saying that we have one
photon in level $a$ and another in level $b$. But at first sight this way of
speaking might seem to imply that there were two states available, both
of the same energy. For we could also expect to be able to put the first
photon in level $b$ and the second in level $a$. The way out of this can be
seen when we consider the example of alpha particles. Suppose we have
two alpha particles with coordinates $x$ and $y$, and say the $x$ particle is
in a level represented by $f(x)$ and the $y$ particle is in a level $g(y)$. Thus
the wave function for the system would be

$$\psi(x,y) = f(x)\,g(y) \tag{9.36}$$

a function of the two variables $x$, $y$. But another state might have $y$ in
the level $f$ and $x$ in $g$, leading to another state of wave function

$$\psi(x,y) = g(x)\,f(y) \tag{9.37}$$

which differs from the first. But if the particles are truly identical, like
alpha particles, the two states are indistinguishable. As we described in
Sec. 1-3, it turns out to be a rule of quantum mechanics (not derivable
from the Schrödinger equation) that for alpha particles the amplitudes
for two cases which differ only by exchange of the alpha particles must
always be added. The only allowed wave function is in this case

$$\psi(x,y) = f(x)\,g(y) + g(x)\,f(y) \tag{9.38}$$

(suitably normalized: if $f$ and $g$ are orthonormal, the factor is $1/\sqrt{2}$; if
$f = g$ and they are normalized, it is 1/2). In general $\psi(x,y) = \psi(y,x)$
for alpha particles and for other particles obeying Bose statistics. There
is, for such particles, only one state: one particle in level $f$, the other in
level $g$.

It turns out that all the results are consistent if, when we consider 
oscillator excitation states as representing numbers of photons, we also 
say that photons are Bose particles. Then the single state $n_a = 1$, $n_b = 1$
represents the situation that there are two photons, one in a, one in b. 
Exchange does not produce a new state. 

For electrons of parallel spin or other Fermi particles we must sub- 
tract the amplitudes when the identity of the particles is reversed. 


$$\psi(x,y) = f(x)\,g(y) - g(x)\,f(y) \tag{9.39}$$


The wave function $\psi(x,y) = -\psi(y,x)$ is antisymmetric in general for
Fermi particles. This is, of course, also just one state. But for Fermi 
particles, two identical particles cannot occupy the same level. If we 
put f = g into Eq. (9.39), we get zero. Two photons, like two alpha 
particles, can occupy the same level; for photons it corresponds to the 
n = 2 oscillator levels. 
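
The normalization factors quoted after Eq. (9.38) and the vanishing of Eq. (9.39) for $f = g$ can be illustrated on a small discrete grid. This is only a sketch under stated assumptions: the three-point "levels" `f` and `g` and the helper names `outer`, `combine`, and `norm` are hypothetical stand-ins for the continuous wave functions of the text.

```python
import math


def outer(f, g):
    """Two-particle amplitude table psi[x][y] = f(x) g(y) on a discrete grid."""
    return [[fx * gy for gy in g] for fx in f]


def combine(f, g, sign):
    """f(x)g(y) + sign * g(x)f(y): sign = +1 for Bose, -1 for Fermi particles."""
    a, b = outer(f, g), outer(g, f)
    size = len(f)
    return [[a[x][y] + sign * b[x][y] for y in range(size)] for x in range(size)]


def norm(psi):
    """Square root of the summed absolute squares of all amplitudes."""
    return math.sqrt(sum(abs(v) ** 2 for row in psi for v in row))


# Two orthonormal "levels" on a three-point grid (illustrative values only).
f = [1.0, 0.0, 0.0]
g = [0.0, 0.6, 0.8]

bose_fg = combine(f, g, +1)   # norm sqrt(2), so the factor is 1/sqrt(2)
bose_ff = combine(f, f, +1)   # norm 2, so the factor is 1/2
fermi_fg = combine(f, g, -1)  # antisymmetric, one allowed state
fermi_ff = combine(f, f, -1)  # identically zero: no two Fermi particles in one level

print(norm(bose_fg), norm(bose_ff), norm(fermi_fg), norm(fermi_ff))
```

The two norms of the symmetric combination are the reciprocals of the factors $1/\sqrt{2}$ and 1/2 mentioned in the text.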

There is one particular situation with matter present which, in the 
ideal case, can be handled nearly as simply as the matter-free case. That 
is the case of a cavity resonator (or a wave guide) where the walls may be 
idealized as perfect conductors. Then classically, as is well known, there 
are a number of possible oscillator modes with more or less complicated 
distributions of electric fields. The classical action is then reducible to a 
set of free oscillators, but the variables now represent the amplitudes of 
the various modes, rather than the amplitudes of plane running waves. 
These oscillators are then analyzed as quantum oscillators, and we speak 
of the number of photons in each mode. 


THE GROUND STATE 


Vacuum Energy. The state of the electromagnetic field of lowest 
possible energy, which we shall call the ground state or the vacuum state,
is that in which there are no photons in any mode. This means that
the energy in each mode is $\hbar\omega/2$, where $\omega$ is the frequency of the mode.
Now if we were to sum this ground-state energy over all of the infinite 
number of possible modes of ever-increasing frequency which exist even 
for a finite box, the answer would be infinity. This is the first symptom 
of the difficulties which beset quantum electrodynamics. 

In the present case, for the vacuum state, the trouble is easily fixed. 
Suppose we choose to measure energy from a different zero point. Since 
there is no physical effect resulting from a constant energy, the result of 
any experiment we perform will be insensitive to the arbitrary choice of 
the zero point in energy. Therefore, we assign to the vacuum state the 
energy zero. Then the total energy in any state of the electromagnetic 
field is given by

$$E = \sum_j n_j\,\hbar\omega_j \tag{9.40}$$

where the sum is taken over all the modes $j$ of the field.

Unfortunately, it is really not true that the zero point of energy can 
be assigned completely arbitrarily. Energy is equivalent to mass, and 
mass has a gravitational effect. Even light has a gravitational effect, for 
light is deflected by the sun. So, if the law that action equals reaction 
has qualitative validity, then the sun must be attracted by the light. 
This means that a photon of energy $\hbar\omega$ has a gravity-producing effect,
and the question is: Does the ground-state energy term $\hbar\omega/2$ also have
an effect? The question stated physically is: Does a vacuum act like a 
uniform density of mass in producing a gravitational field? 

Since most of the space is a vacuum, any effect of the vacuum-state 
energy of the electromagnetic field would be large. We can estimate its 
magnitude. First, it should be pointed out that some other infinities 
occurring in quantum-electrodynamic problems are avoided by a par- 
ticular assumption called the cutoff rule. This rule states that those 
modes having very high frequencies (short wavelengths) are to be ex- 
cluded from consideration. The rule is justified on the grounds that 
we have no evidence that the laws of electrodynamics are obeyed for 
wavelengths shorter than any which have yet been observed. In fact, 
there is a good reason to believe that the laws cannot be extended to 
the short-wavelength region. Mathematical representations which work 
quite well at longer wavelengths lead to divergences if extended into the 
short-wavelength region. The wavelengths in question are of the order 
of the Compton wavelength of the proton; $1/2\pi$ times this wavelength
is $\hbar/m_pc = 2\times10^{-14}$ cm.

For our present estimate suppose we carry out sums over wave numbers
only up to the limiting value $k_{max} = m_pc/\hbar$. Approximating the
sum over levels by an integral, we have, for the vacuum-state energy per
unit volume,

$$\frac{E_0}{\text{unit vol}} = 2\int_0^{k_{max}}\frac{\hbar kc}{2}\,\frac{4\pi k^2\,dk}{(2\pi)^3} = \frac{\hbar c\,k_{max}^4}{8\pi^2} \tag{9.41}$$

(Note the first factor of 2, for there are two modes for each $\mathbf{k}$.) The
equivalent mass of this energy is obtained by dividing the result by $c^2$.
This gives

$$\frac{m_0}{\text{unit vol}} = \frac{\hbar\,k_{max}^4}{8\pi^2c} \approx 2\times10^{15}\ \text{g/cm}^3 \tag{9.42}$$

Such a mass density would, at first sight at least, be expected to 
produce very large gravitational effects which are not observed. It is 
possible that we are calculating in a naive manner, and, if all of the 
consequences of the general theory of relativity (such as the gravitational 
effects produced by the large stresses implied here) were included, the 
effects might cancel out; but nobody has worked all this out. It is 
possible that some cutoff procedure that not only yields a finite energy 
density for the vacuum state but also provides relativistic invariance may 
be found. The implications of such a result are at present completely 
unknown. 
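
The size of the estimate in Eqs. (9.41) and (9.42) is easy to reproduce. The sketch below is an illustration rather than part of the original argument: the rounded CGS constants and the function names are assumptions introduced here, and the midpoint sum is only a cross-check of the closed form.

```python
import math

# Rounded CGS constants (assumed values, not taken from the text).
HBAR = 1.0546e-27         # erg s
C = 2.9979e10             # cm / s
PROTON_MASS = 1.6726e-24  # g


def vacuum_energy_density(k_max):
    """Closed form of Eq. (9.41): hbar * c * k_max**4 / (8 pi^2), in erg / cm^3."""
    return HBAR * C * k_max ** 4 / (8.0 * math.pi ** 2)


def vacuum_energy_density_numeric(k_max, steps=2000):
    """Midpoint-rule version of 2 * integral of (hbar k c / 2) 4 pi k^2 dk / (2 pi)^3."""
    dk = k_max / steps
    total = 0.0
    for i in range(steps):
        k = (i + 0.5) * dk
        total += 2.0 * (HBAR * k * C / 2.0) * 4.0 * math.pi * k ** 2 / (2.0 * math.pi) ** 3
    return total * dk


k_max = PROTON_MASS * C / HBAR                 # cutoff wave number, cm^-1
reduced_wavelength = 1.0 / k_max               # hbar / (m_p c), cm
energy_density = vacuum_energy_density(k_max)  # erg / cm^3
mass_density = energy_density / C ** 2         # g / cm^3

print(f"hbar/(m_p c)  = {reduced_wavelength:.2e} cm")
print(f"mass density  = {mass_density:.2e} g/cm^3")
print(f"numeric check = {vacuum_energy_density_numeric(k_max) / energy_density:.6f}")
```

With these constants the mass density comes out at a few times $10^{15}$ g/cm³, the order of magnitude the text describes as producing unobserved gravitational effects.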

For the present we are safe in assigning the value zero to the vacuum- 
state energy density. Up to the present time no experiments that would 
contradict this assumption have been performed. As we progress further 
into the field of quantum electrodynamics we shall find other divergent 
integrals which are more difficult to circumvent. 


Vacuum Wave Function. The wave function for the set of oscillators
is just the product of the wave functions for each mode. For the
ground state the wave function of the oscillator 1, $\mathbf{k}$ is (see Eq. 8.83)
proportional to $\exp\{-(kc/2\hbar)\,\bar{a}^*_{1,\mathbf{k}}\bar{a}_{1,\mathbf{k}}\}$, where

$$\bar{a}_{1,\mathbf{k}} = a_{1,\mathbf{k}}/\sqrt{\text{Vol}}$$

and "Vol" represents the volume of the normalizing box (see Sec. 4-3).
Thus the wave function for the entire system in the ground state, or
vacuum state, is, within a normalization constant,

$$\Phi_0 = \exp\left[-\sum_{\mathbf{k}}\frac{kc}{2\hbar}\left(\bar{a}^*_{1,\mathbf{k}}\bar{a}_{1,\mathbf{k}} + \bar{a}^*_{2,\mathbf{k}}\bar{a}_{2,\mathbf{k}}\right)\right] \tag{9.43}$$

Problem 9-6 Show, using sine and cosine modes and real variables,
that this expression using complex variables is indeed correct
(cf. Prob. 8-4).


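Problem 9-7 Show, for the vacuum state, the expectation value
of $\bar{a}^*_{1,\mathbf{k}}\bar{a}_{1,\mathbf{q}}$ is $(\hbar/2kc)\,\delta_{\mathbf{k},\mathbf{q}}$ and that of $\bar{a}_{1,\mathbf{k}}\bar{a}_{1,\mathbf{q}}$ is $(\hbar/2kc)\,\delta_{-\mathbf{k},\mathbf{q}}$. Develop
a formula for the expectation of $(\bar{a}^*_{1,\mathbf{k}}\bar{a}_{1,\mathbf{k}})^r$ for integral $r$ and explain
thereby how the expectation of such quantities as $(\bar{a}^*_{1,\mathbf{k}}\bar{a}_{1,\mathbf{k}})^r(\bar{a}^*_{1,\mathbf{q}}\bar{a}_{1,\mathbf{q}})^s$
can be got for $\mathbf{q} \neq \mathbf{k}$. Show that the expectation of $(\bar{a}_{1,\mathbf{k}})^2$ or $(\bar{a}^*_{1,\mathbf{k}})^2$
vanishes. Show that the expectation of the product of any odd number
of $\bar{a}$'s is zero and that you can compute the expectation value of any
product of $\bar{a}$'s or $\bar{a}^*$'s for the vacuum state.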


Problem 9-8 For the state for which there is just one photon
present in level 1, $\mathbf{k}$, all of the factors in the wave function are $\phi_0$ except
one, which is $\phi_1$. But for an oscillator $\phi_1(x) = \sqrt{2}\,x\,\phi_0(x)$. The wave
function representing an excited running wave is a linear superposition
of the state with the cosine mode excited and $i$ times the state with
the sine wave excited, so show that the unnormalized wave function for
just one photon present in level 1, $\mathbf{k}$ is $\bar{a}^*_{1,\mathbf{k}}\Phi_0$. The normalization is
$\int\Phi^*_0\,\bar{a}_{1,\mathbf{k}}\bar{a}^*_{1,\mathbf{k}}\,\Phi_0\,d\bar{a}$, or the expectation of $\bar{a}_{1,\mathbf{k}}\bar{a}^*_{1,\mathbf{k}}$ for the vacuum, which
we have seen in the preceding problem is $\hbar/2kc$. Hence the normalized
one-photon state is $\sqrt{2kc/\hbar}\;\bar{a}^*_{1,\mathbf{k}}\Phi_0$.


INTERACTION OF FIELD AND MATTER 


To deal with the interaction of the radiation field with matter is 
not difficult in a formal way. Evidently from the action expression of 
Eqs. (9.30), (9.31), and (9.33) we see we must deal with the matter 
system interacting with the radiation oscillators and must calculate am- 
plitudes from 


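$$\text{Amplitude} = \int\exp\left[\frac{i}{\hbar}\left(S_{mat} + S_{int} + S_{rad}\right)\right]\prod_{i,\mathbf{k}}\mathcal{D}\mathbf{x}_i\,\mathcal{D}a_{1,\mathbf{k}}\,\mathcal{D}a_{2,\mathbf{k}} \tag{9.44}$$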


The coordinates of the radiation oscillators can be integrated out
immediately; for they appear only in quadratic expressions. We shall do this
integration in the next subsection (starting at Eq. 9.60).

Emission from an Atom. Part of the complication in this problem
is simply the confusion produced by so many coordinates and states. So
we shall begin by dealing with a simple problem just to get more used
to what is involved. We shall solve the problem of the probability of
emission of light by a single atom, using perturbation theory (assuming
the interaction $S_{int}$ of light and matter is small and expanding it only
to the first order).

If $S_{int}$ is neglected, the radiation and matter are independent
systems. Let the states of the atom alone have energy $E_N$ for various values
of $N$ with wave functions $\psi_N(x)$, where $x$ represents the $\mathbf{x}_i$ of all the
particles of the atom. The state of the radiation can be defined by giving
the values of all the integers $n_{1,\mathbf{k}}$ and $n_{2,\mathbf{k}}$. The energy levels of the
combined system are

$$E = E_N + \sum_{\mathbf{k}}\left(n_{1,\mathbf{k}} + n_{2,\mathbf{k}}\right)\hbar kc \tag{9.45}$$

The wave function for this state is a product

$$\Psi = \psi_N(x)\,\Phi(n_{1,\mathbf{k}}, n_{2,\mathbf{k}}) \tag{9.46}$$

where $\Phi(n_{1,\mathbf{k}}, n_{2,\mathbf{k}})$ is the wave function for the radiation field (a product
of harmonic oscillator wave functions).

To deal with atomic radiation of a photon, we consider as the initial
state that the atom is in some level $M$ and no photons are present (all
$n_{1,\mathbf{k}}$ and $n_{2,\mathbf{k}}$ equal 0). This wave function is

$$\Psi_a = \psi_M(x)\,\Phi_0 \tag{9.47}$$

with $\Phi_0$ from Eq. (9.43). In the final state the atom is in another level
$N$, but now a photon is present, say of momentum $\hbar\mathbf{q}$ and polarization 1.
According to Prob. 9-8 the wave function of the radiation alone
is proportional to $\bar{a}^*_{1,\mathbf{q}}\Phi_0$; the complete final wave function is

$$\Psi_b = \psi_N(x)\,\sqrt{\frac{2qc}{\hbar}}\;\bar{a}^*_{1,\mathbf{q}}\,\Phi_0 \tag{9.48}$$

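Now to find the transition probability per second (to first order) we
see, according to Eq. (6.79), we shall need the matrix element $V_{ba}$ of the
perturbation potential between these states. The perturbation action is
$S_{int}$ as defined in Eq. (9.32), and the corresponding potential is

$$V = -\sqrt{4\pi}\sum_{\mathbf{k}}\bar{j}_{1,-\mathbf{k}}\,\bar{a}_{1,\mathbf{k}} \tag{9.49}$$

where $\bar{j}_{1,\mathbf{k}} = j_{1,\mathbf{k}}/\sqrt{\text{Vol}}$ depends on the atomic variables, as in Prob. 9-2.
This matrix element is

$$V_{ba} = -\int\psi^*_N\,\sqrt{\frac{2qc}{\hbar}}\;\Phi^*_0\,\bar{a}_{1,\mathbf{q}}\,\sqrt{4\pi}\sum_{\mathbf{k}}\bar{j}_{1,-\mathbf{k}}\,\bar{a}_{1,\mathbf{k}}\,\psi_M\,\Phi_0\;dx\prod_{\mathbf{k}'}d\bar{a}_{\mathbf{k}'} \tag{9.50}$$

$$\quad = -\sqrt{4\pi}\,\sqrt{\frac{2qc}{\hbar}}\sum_{\mathbf{k}}\int\psi^*_N\,\bar{j}_{1,-\mathbf{k}}\,\psi_M\,dx\int\Phi^*_0\,\bar{a}_{1,\mathbf{q}}\bar{a}_{1,\mathbf{k}}\,\Phi_0\prod_{\mathbf{k}'}d\bar{a}_{\mathbf{k}'} \tag{9.51}$$

because only the currents $j$ depend on $x$. The expectation values of the
$\bar{a}$'s for the vacuum state were worked out in Prob. 9-7, that is,

$$\int\Phi^*_0\,\bar{a}_{1,\mathbf{q}}\bar{a}_{1,\mathbf{k}}\,\Phi_0\prod_{\mathbf{k}'}d\bar{a}_{\mathbf{k}'} = 0$$

unless $\mathbf{k} = -\mathbf{q}$, in which case it is $\hbar/2qc$. Let us write the matrix element
$\int\psi^*_N\,j\,\psi_M\,dx$ as $(j)_{NM}$. Our matrix element is therefore

$$V_{ba} = -\sqrt{2\pi\hbar/qc}\;(\bar{j}_{1,\mathbf{q}})_{NM}$$

The probability of transition per second is then (see Eq. 6.94)

$$\frac{2\pi}{\hbar}\,\frac{2\pi\hbar}{qc}\,\left|(\bar{j}_{1,\mathbf{q}})_{NM}\right|^2\,\delta(E_N - E_M + \hbar qc) \tag{9.52}$$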





Ordinarily we are not interested in the problem of exciting one
particular photon but would rather see the probability of emission of any
photon (of polarization 1) into some small solid angle $d\Omega$. We must
sum $\mathbf{q}$ over all values which correspond to this direction. The number
of values of $\mathbf{q}$ per unit volume is $d^3q/(2\pi)^3$, or if $\mathbf{q}$ is in the specified
direction, we require the integral of $q^2\,dq\,d\Omega/(2\pi)^3$, so that we find that
the probability of a transition per second is

$$\frac{dP}{dt} = \int\frac{2\pi}{\hbar}\,\frac{2\pi\hbar}{qc}\,\left|(j_{1,\mathbf{q}})_{NM}\right|^2\,\delta(E_N - E_M + \hbar qc)\,q^2\,\frac{dq\,d\Omega}{(2\pi)^3} \tag{9.53}$$

The integral on $q$ gives

$$\frac{dP}{dt} = \frac{\omega}{2\pi\hbar c^3}\,\left|(j_{1,\mathbf{q}})_{NM}\right|^2\,d\Omega \tag{9.54}$$

for the rate of emission of light of polarization 1 in direction $\mathbf{q}$ into the
solid angle $d\Omega$. The frequency emitted satisfies

$$\omega = qc = \frac{E_M - E_N}{\hbar} \tag{9.55}$$

Problem 9-9 For a complicated system moving nonrelativistically

$$(j_{1,\mathbf{k}})_{NM} = \sum_i e_i\left(\mathbf{e}_1\cdot\dot{\mathbf{x}}_i\,e^{-i\mathbf{k}\cdot\mathbf{x}_i}\right)_{NM} \tag{9.56}$$

where $\mathbf{e}_1$ is a unit vector in the direction of the polarization of the light
and $e_i$ and $\mathbf{x}_i$ are the charge and position of the $i$th particle. Assume
the wavelength of the light is very large compared with the size of the
atom, i.e., that the absolute square of the wave function describing the
position of the $i$th electron falls to 0 over a distance small compared with
$1/k$. Show that we can then approximate $e^{-i\mathbf{k}\cdot\mathbf{x}_i}$ by unity and write the
matrix element as

$$(j_{1,\mathbf{k}})_{NM} = -i\omega\,\mathbf{e}_1\cdot\boldsymbol{\mu}_{NM} \tag{9.57}$$

where

$$\boldsymbol{\mu}_{NM} = \sum_i (e_i\mathbf{x}_i)_{NM} \tag{9.58}$$

The function $\boldsymbol{\mu}_{NM}$ is called the matrix element of the electric dipole
moment of the atom, and the approximation used to derive Eq. (9.57)
is called the dipole approximation. Show that the total probability to
emit light in any direction per unit time is

$$\frac{dP}{dt} = \frac{4\omega^3}{3\hbar c^3}\,\left|\boldsymbol{\mu}_{NM}\right|^2 \tag{9.59}$$

(Integrate Eq. (9.54) over all directions, remembering that $\mathbf{e}_1$ is
perpendicular to $\mathbf{k}$ and that there are two possible directions of polarization.)
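
The angular integration suggested in Problem 9-9 can be checked numerically, and Eq. (9.59) can then be evaluated for a familiar case. The sketch below is an illustration, not the book's solution: the function names, the sample dipole vector, the rounded constants, and the hydrogen matrix element $|\langle 1s|z|2p,\,m{=}0\rangle| = (128\sqrt{2}/243)\,a_0$ (a standard textbook value) are all assumptions brought in from outside the text.

```python
import math


def polarization_sum(mu, theta, phi):
    """Sum over both transverse polarizations of |e . mu|^2.

    For emission direction n(theta, phi) the two polarization vectors and n
    form an orthonormal triad, so the sum equals |mu|^2 - |n . mu|^2.
    """
    n = (math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta))
    mu_sq = sum(abs(m) ** 2 for m in mu)
    n_dot_mu = sum(ni * mi for ni, mi in zip(n, mu))
    return mu_sq - abs(n_dot_mu) ** 2


def integrate_over_directions(mu, n_theta=200, n_phi=200):
    """Midpoint-rule integral of polarization_sum over the full solid angle."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for a in range(n_theta):
        theta = (a + 0.5) * d_theta
        for b in range(n_phi):
            phi = (b + 0.5) * d_phi
            total += polarization_sum(mu, theta, phi) * math.sin(theta)
    return total * d_theta * d_phi


# Arbitrary complex dipole matrix element (illustrative numbers only).
mu = (0.3, -0.2j, 1.0)
mu_sq = sum(abs(m) ** 2 for m in mu)
numeric = integrate_over_directions(mu)
closed_form = 8.0 * math.pi / 3.0 * mu_sq  # turns (omega^3 / 2 pi hbar c^3) into Eq. (9.59)

# Hydrogen 2p (m = 0) -> 1s in Gaussian units, using e^2 = alpha * hbar * c.
ALPHA = 1.0 / 137.036        # fine-structure constant
HBAR_EV_S = 6.5821e-16       # eV s
C_CM_S = 2.9979e10           # cm / s
BOHR_RADIUS_CM = 5.2918e-9   # cm
omega = 0.75 * 13.606 / HBAR_EV_S                                   # rad / s
z_matrix_element = 128.0 * math.sqrt(2.0) / 243.0 * BOHR_RADIUS_CM  # cm
rate = 4.0 / 3.0 * ALPHA * omega ** 3 * z_matrix_element ** 2 / C_CM_S ** 2

print(f"angular integral: numeric {numeric:.5f}, closed form {closed_form:.5f}")
print(f"2p -> 1s emission rate: {rate:.3e} per second")
```

The factor $8\pi/3$ from the angular integral, multiplied by $\omega^3/2\pi\hbar c^3$, gives the coefficient $4\omega^3/3\hbar c^3$ of Eq. (9.59).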


Elimination of Electromagnetic Field Variables. Since the 
radiation field is represented by a quadratic action functional, we can 
integrate out all its coordinates. We shall do so here. We must integrate 
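all the variables $a_{1,\mathbf{k}}$, $a_{2,\mathbf{k}}$ in Eq. (9.44). We must specify the initial and
final states of the radiation field. First we shall take the simplest case
that initially and finally we have a vacuum, the oscillators all going from
0 to 0 photon number. Our amplitude can be written

$$\text{Amplitude} = \int e^{(i/\hbar)S_{mat}}\,X\,\mathcal{D}\mathbf{x}_i \tag{9.60}$$

where

$$X[\mathbf{x}_i] = \int e^{(i/\hbar)(S_{int} + S_{rad})}\prod_{\mathbf{k}}\mathcal{D}a_{1,\mathbf{k}}\,\mathcal{D}a_{2,\mathbf{k}} \tag{9.61}$$

is a function of the $\mathbf{x}_i$'s which appear on the right-hand side of the
equation in the current variables $j$. Since the action is a sum of contributions
$\sum_{\mathbf{k}}(S_{1,\mathbf{k}} + S_{2,\mathbf{k}})$ from each mode, where

$$S_{1,\mathbf{k}} = \int\left[\frac{\sqrt{4\pi}}{2}\left(\bar{j}^*_{1,\mathbf{k}}\bar{a}_{1,\mathbf{k}} + \bar{j}_{1,\mathbf{k}}\bar{a}^*_{1,\mathbf{k}}\right) + \frac{1}{2}\,\dot{\bar{a}}^*_{1,\mathbf{k}}\dot{\bar{a}}_{1,\mathbf{k}} - \frac{k^2c^2}{2}\,\bar{a}^*_{1,\mathbf{k}}\bar{a}_{1,\mathbf{k}}\right]dt \tag{9.62}$$

clearly $X$ is a product of corresponding factors. The integral for one
typical mode,

$$X_{\mathbf{k}} = \int\exp\left\{\frac{i}{\hbar}\int\left[\frac{\sqrt{4\pi}}{2}\left(\bar{j}^*_{1,\mathbf{k}}\bar{a}_{1,\mathbf{k}} + \bar{j}_{1,\mathbf{k}}\bar{a}^*_{1,\mathbf{k}}\right) + \frac{1}{2}\,\dot{\bar{a}}^*_{1,\mathbf{k}}\dot{\bar{a}}_{1,\mathbf{k}} - \frac{k^2c^2}{2}\,\bar{a}^*_{1,\mathbf{k}}\bar{a}_{1,\mathbf{k}}\right]dt\right\}\mathcal{D}\bar{a}_{1,\mathbf{k}}$$

$$\quad = \exp\left\{-\frac{\pi}{\hbar kc}\iint\bar{j}^*_{1,\mathbf{k}}(t)\,\bar{j}_{1,\mathbf{k}}(s)\,e^{-ikc|t-s|}\,ds\,dt\right\} \tag{9.63}$$

is a type of path integral which we have already done many times,
except for the complication of complex variables, which can first be
reduced to real variables. In fact, this is exactly the problem discussed in
Sec. 8-9. The interaction function $f(t)$ of Eq. (8.136) is here related to
$\sqrt{4\pi}\,j_{1,\mathbf{k}}(t)$, and $\omega = kc$. The final expression of Eq. (9.63) is equivalent
to Eq. (8.138). The product of such factors for each $\mathbf{k}$ and polarization
gives $X = e^{iI/\hbar}$ where

$$I = i\iiint\frac{\pi}{kc}\left[j^*_{1,\mathbf{k}}(t)\,j_{1,\mathbf{k}}(s) + j^*_{2,\mathbf{k}}(t)\,j_{2,\mathbf{k}}(s)\right]e^{-ikc|t-s|}\,ds\,dt\,\frac{d^3k}{(2\pi)^3} \tag{9.64}$$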


Thus the problem of a vacuum-to-vacuum transition is completely 
solved in terms of a path integral over the matter variables alone: 


$$\text{Amplitude} = \int\exp\left[\frac{i}{\hbar}\left(S_{mat} + I\right)\right]\mathcal{D}\mathbf{x} \tag{9.65}$$

We shall discuss a number of consequences of this result. (The case that
the initial or final state is not a vacuum is described in Sec. 9-7.)

It appears that the net result is simply this: The matter acts not
with $S_{mat}$ but with a modified action $S'_{mat} = S_{mat} + I$. The modification
results from a reaction with the electromagnetic field. This is not true in
a strictly classical sense, for the action $I$ is a complex number. It can be
shown that the classical physics which results from using the principle
of least action, with the real part of $S'_{mat}$ only, is exactly equivalent to
the combination of Maxwell's equations and Newton's laws. But it does
not correspond to the case that Maxwell's equations are solved by using
just retarded waves. (In fact, a restriction to retarded waves cannot
be represented by any principle of least action in which only matter
coordinates appear. Instead it corresponds to using half the advanced
and half the retarded solution.¹) Our full quantum-mechanical complex
expression for $I$ is correct, and we shall now look at its consequences.


¹J.A. Wheeler and R.P. Feynman, Interaction with the Absorber as the
Mechanism of Radiation, Rev. Mod. Phys., vol. 17, pp. 157-181, 1945.

First-order Perturbation Expansion. The integral over the $\mathbf{x}$'s
is too complicated to do exactly, but in the expression for the currents
in $I$ the charge $e$ of the particles occurs. Thus $I$ is proportional to
$e^2$, which in dimensionless form, for the electron's charge, is the
fine-structure constant

$$\frac{e^2}{\hbar c} = \frac{1}{137.039}$$

a small number whose exact value and meaning are unknown except
experimentally. Thus we may expect that the effect of $I$ is small. We
already know that the Schrödinger theory gives atomic levels, for example,
quite accurately. There can be only small errors arising from the
neglect of $I$. Let us look at the effect of $I$ in first order in $e^2$,
corresponding to second order in $e$ on the original action of Eq. (9.32). Let
us take the transition amplitude $A_{MM}$ as defined in Sec. 6-5, where the
matter system begins and ends in state $M$. If $I$ is neglected, the zero
order is

$$A^{0}_{MM} = e^{-(i/\hbar)E_MT} \tag{9.66}$$

The first-order term is (where $x$ represents all of the $\mathbf{x}_i$ variables)

$$A^{(1)}_{MM} = \frac{i}{\hbar}\int\psi^*_M(x_b)\,e^{(i/\hbar)S_{mat}}\,I\,\psi_M(x_a)\,\mathcal{D}x(t)$$

$$\quad = \frac{i}{\hbar}\int\psi^*_M(x_b)\,e^{(i/\hbar)S_{mat}}\left\{i\iiint\frac{\pi}{kc}\left[j^*_{1,\mathbf{k}}(t)\,j_{1,\mathbf{k}}(s) + j^*_{2,\mathbf{k}}(t)\,j_{2,\mathbf{k}}(s)\right]e^{-ikc|t-s|}\,ds\,dt\,\frac{d^3k}{(2\pi)^3}\right\}\psi_M(x_a)\,\mathcal{D}x(t) \tag{9.67}$$

Now terminate the integral on $s$ at $t$, and double the result. The
evaluation of a similar expression was worked out in Sec. 5-1. For the present
example, with large values of $T$, we get

$$A^{(1)}_{MM} = -\frac{i}{\hbar}\,(\Delta E)\,T\,e^{-(i/\hbar)E_MT}$$
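where

$$\Delta E = -i\sum_N\int\frac{2\pi}{kc}\left[(j^*_{1,\mathbf{k}})_{MN}(j_{1,\mathbf{k}})_{NM} + (j^*_{2,\mathbf{k}})_{MN}(j_{2,\mathbf{k}})_{NM}\right]\frac{d^3k}{(2\pi)^3}\int_0^\infty e^{(i/\hbar)(E_M - E_N - \hbar kc)\tau}\,d\tau$$

$$\quad = \hbar\sum_N\int\frac{2\pi}{kc}\,\frac{\left|(j_{1,\mathbf{k}})_{NM}\right|^2 + \left|(j_{2,\mathbf{k}})_{NM}\right|^2}{E_M - E_N - \hbar kc + i\epsilon}\,\frac{d^3k}{(2\pi)^3} \tag{9.68}$$

This has a real and imaginary part and can be written as

$$\Delta E = \delta E - \frac{i\hbar\gamma}{2}$$

The real part $\delta E$ represents a small shift in the energy levels of the
atom, called the Lamb shift. Such a shift was discovered experimentally
by Lamb and Retherford. This is

$$\delta E = \hbar\sum_N\int\frac{2\pi}{kc}\left[\left|(j_{1,\mathbf{k}})_{NM}\right|^2 + \left|(j_{2,\mathbf{k}})_{NM}\right|^2\right]\text{P.P.}\left(\frac{1}{E_M - E_N - \hbar kc}\right)\frac{d^3k}{(2\pi)^3} \tag{9.69}$$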

and the imaginary part is given by

$$\gamma = \sum_N\int\frac{2\pi}{kc}\left[\left|(j_{1,\mathbf{k}})_{NM}\right|^2 + \left|(j_{2,\mathbf{k}})_{NM}\right|^2\right]2\pi\,\delta(E_M - E_N - \hbar kc)\,\frac{d^3k}{(2\pi)^3} \tag{9.70}$$

The amplitude that the atom remains in the upper state with no
photons emitted goes as $\exp\{-(i/\hbar)(E_M + \delta E - i\hbar\gamma/2)T\}$ and the
probability as $e^{-\gamma T}$. That is, the probability to remain in state $M$ decreases
exponentially with the decay rate $\gamma$. Physically it should decrease
because the atom in state $M$ can emit a photon and fall to a lower state
$N$. Comparison with Eq. (9.53) shows that $\gamma$ in Eq. (9.70) is indeed the
total rate of transition from state $M$ to all lower states $N$.
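
The separation of $\Delta E$ in Eq. (9.68) into a principal-part shift and a decay rate rests on the limit $1/(x + i\epsilon) \to \text{P.P.}(1/x) - i\pi\,\delta(x)$ as $\epsilon \to 0$. The following sketch checks the imaginary part numerically. It is an illustration under stated assumptions: the Gaussian $e^{-x^2}$ is a hypothetical smooth test function standing in for the $k$-dependent weight of the matrix elements, and `imag_part` is a name introduced here.

```python
import math


def imag_part(eps, n=100000):
    """Imaginary part of the integral of exp(-x^2) / (x + i*eps) over the real line.

    Im 1/(x + i*eps) = -eps / (x^2 + eps^2).  Substituting x = eps * tan(theta)
    turns the integral into -integral of exp(-(eps * tan(theta))^2) d(theta)
    over (-pi/2, pi/2), which is evaluated here with the midpoint rule.
    """
    d_theta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -math.pi / 2.0 + (i + 0.5) * d_theta
        total += math.exp(-(eps * math.tan(theta)) ** 2)
    return -total * d_theta


for eps in (1.0, 0.1, 0.01, 0.001):
    print(f"eps = {eps:6.3f}   Im = {imag_part(eps):.5f}")
print(f"-pi * f(0)     = {-math.pi:.5f}")
```

As $\epsilon$ shrinks, the imaginary part approaches $-\pi$ times the value of the test function at $x = 0$, which is the delta-function term that produces $-i\hbar\gamma/2$.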


A SINGLE ELECTRON IN A RADIATIVE FIELD 


The Energy Correction. In order to study the electromagnetic
energy correction $\delta E$, we shall consider the simplest case: that in which
the matter system has only one moving charge (e.g., a hydrogen atom
with an infinitely heavy nucleus or a free electron in empty space) whose
coordinates we shall call $\mathbf{x}$. Thus $\mathbf{j}_{\mathbf{k}} = e\,\dot{\mathbf{x}}\,e^{-i\mathbf{k}\cdot\mathbf{x}}$. We have here a case
where $\mathbf{j}_{\mathbf{k}}$ contains $\dot{\mathbf{x}}$, and in considering second-order terms we must take
appropriate care, as discussed in Sec. 7-3. There is an additional term
to $\delta E$ from the squared velocity term $\dot{\mathbf{x}}^2$. Expressing $\dot{\mathbf{x}}$ in terms of the
momentum operator $\mathbf{p}$, as in Sec. 7-5, we obtain

$$\delta E = \frac{e^2}{m^2}\sum_N\int\frac{2\pi\hbar}{kc}\,\frac{\left|(p_1e^{-i\mathbf{k}\cdot\mathbf{x}})_{NM}\right|^2 + \left|(p_2e^{-i\mathbf{k}\cdot\mathbf{x}})_{NM}\right|^2}{E_M - E_N - \hbar kc}\,\frac{d^3k}{(2\pi)^3} + \frac{e^2}{m}\int\frac{2\pi\hbar}{kc}\,\frac{d^3k}{(2\pi)^3} \tag{9.71}$$

Problem 9-10 Why do we not need to be careful to write
$\tfrac{1}{2}(p_1e^{-i\mathbf{k}\cdot\mathbf{x}} + e^{-i\mathbf{k}\cdot\mathbf{x}}p_1)$ in the matrix elements?


Let us take the simplest case of a free electron at rest. Any $\delta E_R$ we
get for energy in the field will represent a correction to the rest energy, or
as can be shown from the relativity theory, to the mass, $\delta m = \delta E_R/c^2$.
This is the so-called electromagnetic mass correction. For a free particle
at rest, the states are plane waves. If the momenta in $M$ and $N$ are $\mathbf{p}_M$
and $\mathbf{p}_N$, the matrix element $(p_1e^{-i\mathbf{k}\cdot\mathbf{x}})_{NM}$ is zero unless $\mathbf{p}_N = \mathbf{p}_M - \hbar\mathbf{k}$,
in which case it is $p_{1M}$. Thus for an electron at rest initially, the matrix
element is 0 and $\delta E_R$ is just the last integral of Eq. (9.71), which is
infinite!


Difficulties at Short Wavelengths. This is not the whole of it.
When, at Eq. (9.29), we eliminated the term $4\pi\rho_{\mathbf{k}}\rho_{-\mathbf{k}}/k^2$ in $S_c$, we
pointed out that this represented the interaction between point charges

$$\frac{1}{2}\sum_i\sum_j\frac{e_ie_j}{|\mathbf{x}_i - \mathbf{x}_j|}$$

but failed to point out that the infinite terms $i = j$ must also be
included in the sum. Indeed, for a single particle $\rho_{\mathbf{k}} = e\,e^{-i\mathbf{k}\cdot\mathbf{x}}$, so
$4\pi|\rho_{\mathbf{k}}|^2/k^2 = 4\pi e^2/k^2$ and the term is

$$\delta E_c = \frac{4\pi e^2}{2}\int\frac{1}{k^2}\,\frac{d^3k}{(2\pi)^3}$$

The infinities here and in $\delta E_R$ above do not cancel, and we are left with
a real difficulty; our integrals over momentum $\mathbf{k}$ diverge quadratically.
Quantum electrodynamics gives nonsensical results.

It is true that we are using a nonrelativistic treatment of the charged 
particle. The relativistic treatment of the matter (quantum electrody- 
namics is not altered) does not rid us of the divergent results, although 
the order of infinity may be changed. For a particle of spin 0, like a $\pi$
meson, the order is unchanged; it is still a quadratic divergence. Here there
is presumably an experimental value of the mass correction available. As
far as is known, through other interactions, the sole difference between
charged and neutral $\pi$ mesons is the charge, i.e., the different way they
couple to the electromagnetic field. So presumably the mass difference
of the charged $\pi$ meson with a mass $m_\pi$ of 273.2 electron masses and the
neutral $\pi$ meson of 264.2 electron masses, that is 9.0 electron masses, or
$0.034\,m_\pi$, or 4.6 MeV, represents energy in the electromagnetic field.

If we arbitrarily stop our integrals at some higher momentum $k_{max}$
(which is not a relativistically invariant procedure), we get an energy
$\hbar e^2(k_{max})^2/2\pi m_\pi c$ from the last term of Eq. (9.71), which is the largest
term if $\hbar k_{max}/c$ is very much larger than the $\pi$ meson mass $m_\pi$. If this
equals $0.034\,m_\pi c^2$, then $(e^2/2\pi\hbar c)(\hbar k_{max}/m_\pi c)^2 = 0.034$, or

$$k_{max} \approx \frac{5.4\,m_\pi c}{\hbar} = \frac{0.8\,M_pc}{\hbar}$$

where $M_p$ is the mass of a proton. (The relativistic theory gives
$\Delta E = 0.034\,m_\pi c^2$ with a cutoff at about the same energy.) It is for
this reason that we conclude that our present-day formulation of
quantum electrodynamics (or of the "particles" with which photons interact)
is faulty. The fault lies in the way we deal with energies beyond proton
mass or with corresponding frequencies, or wave numbers. The difficulty
arises with modes whose wavelength is less than about $4\pi\times10^{-14}$ cm.
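
The arithmetic behind the cutoff estimate can be reproduced in a few lines. This is a sketch, not part of the original argument: the pion masses are those quoted in the text, while the fine-structure constant, the electron rest energy, and the proton-to-electron mass ratio are rounded values assumed here, and `cutoff_in_pion_units` is a hypothetical helper name.

```python
import math

ALPHA = 1.0 / 137.036        # e^2 / (hbar c), assumed rounded value
ELECTRON_MASS_MEV = 0.511    # electron rest energy in MeV, assumed rounded value
PION_MASS = 273.2            # charged pi meson, in electron masses (from the text)
NEUTRAL_PION_MASS = 264.2    # neutral pi meson, in electron masses (from the text)
PROTON_MASS = 1836.15        # proton, in electron masses (assumed)


def cutoff_in_pion_units(mass_fraction, alpha=ALPHA):
    """Solve (alpha / 2 pi) * x**2 = mass_fraction for x = hbar * k_max / (m_pi * c)."""
    return math.sqrt(2.0 * math.pi * mass_fraction / alpha)


mass_difference_mev = (PION_MASS - NEUTRAL_PION_MASS) * ELECTRON_MASS_MEV
x_pion = cutoff_in_pion_units(0.034)          # cutoff in units of m_pi c / hbar
x_proton = x_pion * PION_MASS / PROTON_MASS   # same cutoff in units of M_p c / hbar

print(f"mass difference       = {mass_difference_mev:.2f} MeV")
print(f"hbar k_max / (m_pi c) = {x_pion:.2f}")
print(f"hbar k_max / (M_p c)  = {x_proton:.2f}")
```

The two printed ratios correspond to the coefficients 5.4 and 0.8 in the expression for $k_{max}$.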

For the electron of spin 1/2 the Dirac theory shows that the electron
should have a certain magnetic moment. It turns out that with such a 
magnetic moment the negative magnetic energy almost perfectly cancels 
the positive electric energy. The difference still diverges, although only 
logarithmically. If a cutoff is applied to integrals over wavelengths, at the 
wavelength limit suggested above, the correction to the electron mass is 
only about 3 per cent, but there is no way to test this, for we do not 
recognize a neutral counterpart to the electron. 

For the proton the anomalous magnetic moment is so high that the
magnetic energy exceeds the electric energy and the correction can be
negative. The neutron is a magnet, too, so its correction is also negative.
Since the proton moment is higher, the fact that the neutron is heavier
than the proton might be explained. If the integrals are cut off at an
energy of the order of a proton mass, the difference comes out correctly,
but this is too crude a way to calculate such an accurately known energy
as the 782.61 ± 0.40 keV¹ equivalent to this mass difference. These mass
differences (of proton and neutron, of charged and neutral $\pi$ meson,
of positive, neutral, and negative sigma mesons, and of charged and
neutral K mesons, etc.) present a serious challenge to modern physics
and possibly point to the failure of quantum electrodynamics to give
us a complete theory for calculating electromagnetic effects. We do not
know whether it is truly quantum electrodynamics or our assumptions
about the distribution of charge inside the particles which are at fault.
Only when we have a more complete theory of these particles and their
interactions will we be able to determine the limitations, if any, of our
present theory of quantum electrodynamics.

¹Page 354 of F. Everling, et al., Atomic Masses of Nuclides for A ≤ 70,
Nucl. Phys., vol. 15, pp. 342-355, 1960.


THE LAMB SHIFT 


According to the Schrödinger equation, the second level of the hydrogen
atom is degenerate. The 2s and 2p levels occur at the same energy.
Likewise, for the Dirac equation there is a degenerate pair $2s_{1/2}$ and
$2p_{1/2}$. But Lamb and Retherford found in 1946 that there is indeed a
small separation (about 1 part in $3\times10^6$) with the $2s_{1/2}$ lying higher
by a frequency of 1,057.1 megacycles.

Although it was reasoned that such an energy difference might
arise from effects of the term $I$, the infinities of the divergent integrals
confused all attempts to calculate the difference until the work of Bethe
and Weisskopf in 1947. They reasoned as follows:

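First, since

$$ \frac{1}{E_M - E_N - \hbar kc} = -\frac{1}{\hbar kc} + \frac{1}{\hbar kc}\,\frac{E_M - E_N}{E_M - E_N - \hbar kc} \tag{9.72} $$

the energy (9.71) was expressed as the sum of three terms,

$$ \delta E = \delta E' + \delta E'' + \delta E''' \tag{9.73} $$

where

$$ \delta E' = \frac{2\pi e^2}{m^2c^2} \sum_N \int \frac{(E_M - E_N)\left(|(p_1 e^{i\mathbf{k}\cdot\mathbf{x}})_{NM}|^2 + |(p_2 e^{i\mathbf{k}\cdot\mathbf{x}})_{NM}|^2\right)}{E_M - E_N - \hbar kc}\,\frac{d^3k}{k^2(2\pi)^3} \tag{9.74} $$

$$ \delta E'' = -\frac{2\pi e^2}{m^2c^2} \sum_N \int \left(|(p_1 e^{i\mathbf{k}\cdot\mathbf{x}})_{NM}|^2 + |(p_2 e^{i\mathbf{k}\cdot\mathbf{x}})_{NM}|^2\right) \frac{d^3k}{k^2(2\pi)^3} \tag{9.75} $$

$$ \delta E''' = \frac{2\pi e^2\hbar}{mc} \int \frac{1}{k}\,\frac{d^3k}{(2\pi)^3} \tag{9.76} $$
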

The term $\delta E'''$ and the infinity from the Coulomb term $\delta E_c$ are
independent of the state of the electron. They would (it was hoped)
be made finite in some future theory. They would contribute some δm to
the rest mass of the electron. If $m_0$ is the mechanical mass, the true
experimental mass would be $m = m_0 + \delta m$, where $\delta m\,c^2 = \delta E''' + \delta E_c$.
In the total energy (including the rest energy of the particles and the
binding energy) of the hydrogen atom, such a rest-energy correction to
the energy is, of course, expected, but we have already included it when
we measure all binding energies relative to the free-particle ionized state.
The δm term is thus identified, because it is the only term for an electron
at rest, and it is independent of the motion or state of the electron.

The term $\delta E''$ could be simplified, for the sum over N could be taken
to give $(p_1^2 + p_2^2)_{MM}$ (by the law of matrix multiplication). When k is
integrated over all directions, this becomes $\tfrac{2}{3}(\mathbf{p}\cdot\mathbf{p})_{MM}$, and

$$ \delta E'' = -\frac{(p^2)_{MM}}{2m}\,\frac{8\pi e^2}{3mc^2} \int \frac{1}{k^2}\,\frac{d^3k}{(2\pi)^3} \tag{9.77} $$

Again it was hoped that some day this term would be finite. It exists
even for a free electron. It is interpreted as follows: The mechanical
kinetic energy $p^2/2m_0$ would be altered (if the mass is altered) to the
expression

$$ \frac{p^2}{2m} = \frac{p^2}{2m_0}\left(1 - \frac{\delta m}{m_0}\right) \tag{9.78} $$

and the term $\delta E''$ must represent $-(p^2/2m_0)\,\delta m/m_0$. But we have
already taken this term into account; for we calculate the Schrödinger
energy levels with $p^2/2m$, where m is the experimental mass. The term
is identified because it is the only extra term for a moving free electron
and it is proportional to the kinetic energy.‡ Finally, even though these
terms may be interpreted wrongly, when we calculate the difference of
the values of δE for the 2s and 2p states, the terms will drop out,
because $\delta E'''$ and $\delta E_c$ are the same for all states and $\delta E''$ is also the same,
since $(p^2/2m)_{MM}$ turns out to be the same for the two states 2s and 2p.
In the remaining term $\delta E'$ the argument was made that the dipole
approximation would suffice. Then the matrix elements are independent
of k, and since

$$ \int \frac{1}{k^2}\,\frac{1}{E_M - E_N - \hbar kc}\,\frac{d^3k}{(2\pi)^3} = -\frac{1}{2\pi^2\hbar c}\,\ln\frac{\hbar k_{\max}c}{E_N - E_M} \tag{9.79} $$

we get

$$ \delta E' = \frac{e^2}{\pi m^2\hbar c^3} \sum_N (E_N - E_M)\,\tfrac{2}{3}\,|\mathbf{p}_{NM}|^2 \,\ln\frac{\hbar k_{\max}c}{E_N - E_M} \tag{9.80} $$


‡The δm implied by Eq. (9.77) is $(8\pi e^2/3c^2)\int (1/k^2)\,d^3k/(2\pi)^3$ and is not equal
to the δm obtained from $\delta E/c^2$ for a static electron. This is because we are limited
to a nonrelativistic approximation. When a fully relativistic analysis is carried out,
the two ways of calculating δm agree.

Since the states and the matrix elements are known for hydrogen, the
sum can be worked out. The only question is the value of $\hbar k_{\max}c$. Bethe
argued that the nonrelativistic approximation is at fault here and that,
if the full relativistic calculation were done, $\hbar k_{\max}c$ would turn out to be
of the order $mc^2$. Putting $\hbar k_{\max}c = mc^2$ gave about 1,000 megacycles,
so Bethe knew he was on the right track.
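
The size of this estimate can be checked with a few lines of arithmetic. The sketch below is only illustrative and rests on two inputs that are not derived in this section and are assumptions here: the hydrogen sum rule $\sum_N (E_N - E_M)|\mathbf{p}_{NM}|^2 = 2\pi\hbar^2e^2|\psi_M(0)|^2$, which turns Eq. (9.80) into $(8/3\pi)\,\alpha^3\,\mathrm{Ry}/n^3$ times the logarithm for an s state, and an average excitation energy of about 16.64 rydbergs for the 2s state, a value taken from the later literature. The function name and its parameters are hypothetical.

```python
import math

# Physical constants (rounded reference values).
ALPHA = 7.2973525693e-3      # fine-structure constant e^2/(hbar c)
RYDBERG_EV = 13.605693       # rydberg energy in eV
MC2_EV = 510998.95           # electron rest energy in eV
H_EV_S = 4.135667696e-15     # Planck's constant in eV*s

def bethe_shift_hz(n=2, avg_excitation_ry=16.64, cutoff_ev=MC2_EV):
    """Nonrelativistic estimate of the shift of the ns level, in Hz.

    avg_excitation_ry is an ASSUMED average of E_N - E_M (in rydbergs)
    that replaces the level-by-level logarithm in Eq. (9.80).
    cutoff_ev plays the role of hbar*k_max*c.
    """
    prefactor_ev = 8.0 / (3.0 * math.pi) * ALPHA**3 * RYDBERG_EV / n**3
    log_term = math.log(cutoff_ev / (avg_excitation_ry * RYDBERG_EV))
    return prefactor_ev * log_term / H_EV_S

shift_mhz = bethe_shift_hz() / 1e6
print(f"estimated 2s shift: {shift_mhz:.0f} MHz")
```

With the cutoff set to $mc^2$ the result comes out near 1,000 megacycles, in line with the figure quoted above. The estimate depends only logarithmically on the cutoff, which is why a crude choice already gives the right order of magnitude.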

The remaining problem was to make a relativistic calculation with 
the Dirac wave function and states. Only in this way could a precise 
determination of the effective kmax be made. This turned out to be quite 
confusing, for it was hard to identify the various infinite terms. It would 
not do to simply cut them off at some maximum momentum and take the 
difference; for this is not necessarily a relativistically invariant procedure 
because it deals with momentum and energy in different ways. (One 
consequence of this has already been pointed out in the footnote.) One 
method for resolving the confusion was developed by Schwinger, who 
showed how the relativistic symmetry could be kept clear throughout 
the calculation and the infinite terms identified. Another method worked 
out by Feynman was to give a relativistically invariant procedure to cut 
off the infinite integrals. Here we shall illustrate the latter method. 

The total effect of the electromagnetic field, which this time includes
the Coulomb interaction, is represented by an extra term $I + S_c$ in the
action. The relativistic invariance of an expression for I like Eq. (9.64)
will not be self-evident, since that formula is expressed in terms of k
and t instead of either r and t or k and ω. Let us represent I in terms
of wave number k and frequency ω variables. First note that, in light of
Eq. (A.10),

$$ \int_{-\infty}^{\infty} e^{-ikc|\tau|}\,e^{-i\omega\tau}\,d\tau = -\frac{2ikc}{k^2c^2 - \omega^2 - i\epsilon} \tag{9.81} $$

or

$$ e^{-ikc|t-s|} = -\int_{-\infty}^{\infty} \frac{2ikc}{k^2c^2 - \omega^2 - i\epsilon}\,e^{i\omega(t-s)}\,\frac{d\omega}{2\pi} \tag{9.82} $$
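
The transform in Eq. (9.81) can be checked numerically. The sketch below is an illustration only: it assumes that the small quantity ε acts as a slight damping, so that kc is replaced by kc − i·eps, and it compares a trapezoidal estimate of the integral with the closed form. The function names and the numerical parameters are hypothetical choices.

```python
import cmath

def transform_numeric(kc, omega, eps, t_max=200.0, steps=200_000):
    """Trapezoidal estimate of the integral over tau of
    exp(-i*kc*|tau| - eps*|tau|) * exp(-i*omega*tau)."""
    h = 2.0 * t_max / steps
    total = 0.0 + 0.0j
    for j in range(steps + 1):
        tau = -t_max + j * h
        value = cmath.exp(-(1j * kc + eps) * abs(tau) - 1j * omega * tau)
        weight = 0.5 if j in (0, steps) else 1.0   # end points get half weight
        total += weight * value
    return total * h

def transform_closed_form(kc, omega, eps):
    """-2i*kc'/(kc'^2 - omega^2) with kc' = kc - i*eps (assumed damping)."""
    kc_damped = kc - 1j * eps
    return -2j * kc_damped / (kc_damped**2 - omega**2)

numeric = transform_numeric(kc=1.0, omega=0.4, eps=0.1)
closed = transform_closed_form(kc=1.0, omega=0.4, eps=0.1)
print(abs(numeric - closed))
```

The two values agree to within the discretization error of the trapezoidal rule, and the closed form goes over to the right-hand side of Eq. (9.81) as eps tends to zero.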


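Suppose we define

$$ \mathbf{j}(\mathbf{k},\omega) = \int \mathbf{j}_{\mathbf{k}}(t)\,e^{i\omega t}\,dt = \int\!\!\int \mathbf{j}(\mathbf{r},t)\,e^{-i(\mathbf{k}\cdot\mathbf{r} - \omega t)}\,d^3r\,dt \tag{9.83} $$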


Then I becomes (for long time intervals T)

$$ I = 2\pi \int \frac{|j_1(\mathbf{k},\omega)|^2 + |j_2(\mathbf{k},\omega)|^2}{k^2c^2 - \omega^2 - i\epsilon}\,\frac{d^3k\,d\omega}{(2\pi)^4} \tag{9.84} $$

The relativistic symmetry of this expression in k and ω is already
clear; for $k^2c^2 - \omega^2$ is invariant to the Lorentz transformation. The
currents, however, do not appear in a relativistically symmetrical manner.
We would have expected an invariant combination like $\mathbf{j}\cdot\mathbf{j} - c^2\rho^2$, since
j and cρ form a four-vector. But if we define


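$$ \rho(\mathbf{k},\omega) = \int \rho_{\mathbf{k}}(t)\,e^{i\omega t}\,dt = \int\!\!\int \rho(\mathbf{r},t)\,e^{-i(\mathbf{k}\cdot\mathbf{r} - \omega t)}\,d^3r\,dt \tag{9.85} $$

the Coulomb portion of the action, Eq. (9.29), is

$$ S_c = -2\pi \int \frac{|\rho(\mathbf{k},\omega)|^2}{k^2}\,\frac{d^3k\,d\omega}{(2\pi)^4} = -2\pi \int \frac{c^2|\rho(\mathbf{k},\omega)|^2 - |\omega\rho(\mathbf{k},\omega)/k|^2}{k^2c^2 - \omega^2}\,\frac{d^3k\,d\omega}{(2\pi)^4} \tag{9.86} $$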


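the last resulting simply by multiplying numerator and denominator by
$c^2 - \omega^2/k^2$. But the law of conservation of current

$$ \nabla\cdot\mathbf{j} = -\frac{\partial\rho}{\partial t} \tag{9.87} $$

becomes

$$ \omega\,\rho(\mathbf{k},\omega) = \mathbf{k}\cdot\mathbf{j}(\mathbf{k},\omega) \tag{9.88} $$

Alternatively, if we call $j_3$ the component of j in the direction of k,
$\omega\rho/k = j_3$ and we have altogether

$$ I + S_c = 2\pi \int \frac{|j_1(\mathbf{k},\omega)|^2 + |j_2(\mathbf{k},\omega)|^2 + |j_3(\mathbf{k},\omega)|^2 - c^2|\rho(\mathbf{k},\omega)|^2}{k^2c^2 - \omega^2 - i\epsilon}\,\frac{d^3k\,d\omega}{(2\pi)^4} \tag{9.89} $$

The sum of the three j terms is just $\mathbf{j}^*(\mathbf{k},\omega)\cdot\mathbf{j}(\mathbf{k},\omega)$, and the
four-dimensional invariance is evident.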

A suggestion is now made that in view of our present ignorance,
convergence of the integrals can be made artificially by supplying an
additional factor such as

$$ \left(\frac{\Lambda^2}{k^2c^2 - \omega^2 + \Lambda^2 - i\epsilon}\right)^2 $$

in the integrand of Eq. (9.89), where Λ is some very high frequency. For
small values of k and ω this factor is unity, whereas for high values it cuts
off the integral. Furthermore, such a factor clearly does not destroy the
relativistic invariance of the expression. All physical quantities are to
be calculated by assuming $I + S_c$ contains this cutoff factor. If they are
insensitive to Λ for large Λ (like the Lamb shift), the theoretical value
is to be trusted. If, on the other hand, the result depends sensitively
on Λ (as does the charged and neutral π meson mass difference) no
quantitative meaning can be given to the result, for the cutoff function
is arbitrary and is not completely satisfactory. This is the present state
of quantum electrodynamics.


Problem 9-11 Show that the cutoff function is not completely
satisfactory by arguing that γ calculated in the manner of Sec. 9-4 would
be altered by the cutoff, yet the probability of emission of a real photon
would not be so altered because for such a photon ω = kc and the
modifying cutoff factor is exactly 1. Thus the balance of probability would
not result (i.e., the probability that the atom emits plus the probability
that it does not emit would no longer add to unity). The difficulty
suggested by this problem has never been solved. No modification of
quantum electrodynamics at high frequencies is known which
simultaneously makes all results finite, maintains relativistic invariance, and
keeps the sum of probabilities over all alternatives equal to unity.


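Problem 9-12 Transform $I + S_c$ into space coordinates by using

$$ \int \frac{e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}}{k^2c^2 - \omega^2 - i\epsilon}\,\frac{d^3k\,d\omega}{(2\pi)^4} = \frac{1}{(2\pi)^2c}\,\frac{i}{r^2 - c^2t^2 + i\epsilon} = \frac{1}{4\pi c}\,\delta_+(r^2 - c^2t^2) \tag{9.90} $$

(Note: The function $i/[\pi(x + i\epsilon)]$ is often written as $\delta_+(x)$, and we have
introduced that convention here.) Then find

$$ I + S_c = \frac{1}{2c} \int\!\!\int \left[\mathbf{j}(\mathbf{r}_1,t_1)\cdot\mathbf{j}(\mathbf{r}_2,t_2) - c^2\rho(\mathbf{r}_1,t_1)\,\rho(\mathbf{r}_2,t_2)\right] \delta_+\!\left(|\mathbf{r}_1 - \mathbf{r}_2|^2 - c^2(t_1 - t_2)^2\right) d^3r_1\,dt_1\,d^3r_2\,dt_2 \tag{9.91} $$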


THE EMISSION OF LIGHT 


In Sec. 9-4 we found an expression for the amplitude that the matter
system would do something when interacting with an electromagnetic
field, as shown in Eq. (9.60) and the following development. This
derivation was restricted to the special case that the field is both initially and
finally in the vacuum state with no photons present. The result was that
the action $S_{\rm mat}$ in the path integrals must be replaced by an effective
action $S'_{\rm mat} = S_{\rm mat} + I$. In a more general case, photons are present,
both initially and finally. As an illustration suppose that initially no
photons are present but in the final state there is just one photon of
momentum ħq and polarization 1. The only change which this makes
in our previous calculation is the change in the integral defining X, that
is, Eq. (9.61). We shall now use


x'= ff lS TT Dar x Daz (9.92) 


where the path integral is carried out between a vacuum initial state
and a final state consisting of a vacuum plus one photon. Then every
oscillator, except 1, q, goes from the initial state n = 0 to the final state
n = 0, so the factor $X_{1,\mathbf{k}}$ for all these oscillators is unchanged. Only the
contribution from the single oscillator 1, q is altered; for it now becomes


1 Tk = = -x 
A a = fæti | Vai aðna + hatha) 
Lay ee q C — * = hge = 
az g^a ~ 5 llaqlla T Ee it} Daiq (9.93) 


This expression is the same as Eq. (9.63) except that the oscillator path 
is taken between the state n = 0 and the state n = 1 instead of n = 0 
to n = 0 as in the previous expression. We worked out the behavior of 
a forced harmonic oscillator in Sec. 8-9, and we can use the results of 
that section to write 


on ft. , 
X! = |i — a aet dt | X 9.94 
ta (E f hadt) Xa (9.94) 


where $X_{1,\mathbf{q}}$ is the n = 0 to n = 0 factor previously calculated.
Therefore, evidently the complete factor X′ is simply the original factor X
multiplied by


to 
t 
jig dt 
V Tigo ENI 


and we find for the amplitude


. bb — , 
Amplitude = t4 / = [rome J jiq * dt Dx (9.95) 
| ba 


The perturbation theory expression which we previously evaluated 
(at Eq. 9.50) is equivalent to the transition element 


oT , be ! 
Hl ae J eli/F) Smas / haet dt Dax (9.96) 
ta 


so we see that the net result is the same as that given by the perturbation
theory except that the transition amplitude must be calculated with the
effective action $S'_{\rm mat} = S_{\rm mat} + I$ instead of with just $S_{\rm mat}$. The effect
of I is to change the energy levels a bit, as we discussed, but also to
make the energy values complex. The result is that the emitted light
gives a spectral line with a little width, which is called the natural line
width. We shall not go any further into the details of this calculation but
leave the subject and the generalizations to a number of photons both
entering and leaving the system to those who wish to study quantum
electrodynamics specifically in more detail.


SUMMARY 


Review of the Approach. In this chapter we have done a
considerable amount of analysis of the quantum electromagnetic field. It is
worthwhile looking back again to see the central ideas and results. The
separation of the Coulomb interaction and the use of running waves are
technical ways of accomplishing our ends, but the essential result is the
formula of Eq. (9.89) (or its equivalent, Eq. 9.91). Let us review this
result from the more general point of view exemplified by the ideas of
Eq. (9.1).

Suppose we have a system which can be described by an action

$$ S = S_1[x] + S_2[x, A, \phi] + S_3[A, \phi] \tag{9.97} $$


where $S_1[x]$ is the action of the matter alone, $S_2[x, A, \phi]$ is the
interaction of matter and field, and $S_3[A, \phi]$ is the action of the field alone, and
where x stands for all the coordinates of the matter, while A, φ describe
the field. Then the amplitude for any event results from evaluating a
path integral like

$$ K = \int \exp\left\{\frac{i}{\hbar}\left(S_1[x] + S_2[x, A, \phi] + S_3[A, \phi]\right)\right\} \mathcal{D}x\,\mathcal{D}A\,\mathcal{D}\phi \tag{9.98} $$


subject to the boundary conditions of the problem in question. In this
summary we shall assume that conditions for the field are that initially
and finally no photons are present (i.e., ground state to ground state for
the field), and we abbreviate this set of conditions as gnd-gnd. Later on
we shall consider the consequences of integrating x first and A, φ last.
What we have done so far corresponds instead to integrating A, φ first
and reserving the x integral as a subsequent step.

Usually $S_2[x, A, \phi]$ is linear in the field variables A, φ and can be
written as

$$ S_2 = -\int\!\!\int \left[\rho(\mathbf{r},t)\,\phi(\mathbf{r},t) - \mathbf{j}(\mathbf{r},t)\cdot\mathbf{A}(\mathbf{r},t)/c\right] d^3r\,dt \tag{9.99} $$


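where ρ and j are the electric charge and current density, which depend
on the x only. The integral over A, φ is then easily performed because it
is a gaussian integral. It is the burden of Eq. (9.91) to tell us the value
of this integral, namely,

$$ \int_{\rm gnd}^{\rm gnd} \exp\left\{\frac{i}{\hbar}\left(S_3[A, \phi] - \int\!\!\int (\rho\phi - \mathbf{j}\cdot\mathbf{A}/c)\,d^3r\,dt\right)\right\} \mathcal{D}A\,\mathcal{D}\phi = e^{(i/\hbar)J} \tag{9.100} $$

where J, which we called $I + S_c$ in Eq. (9.91), is

$$ J = \frac{1}{2c} \int\!\!\int \left[\mathbf{j}(\mathbf{r}_1,t_1)\cdot\mathbf{j}(\mathbf{r}_2,t_2) - c^2\rho(\mathbf{r}_1,t_1)\,\rho(\mathbf{r}_2,t_2)\right] \delta_+\!\left(|\mathbf{r}_1 - \mathbf{r}_2|^2 - c^2(t_1 - t_2)^2\right) d^3r_1\,dt_1\,d^3r_2\,dt_2 \tag{9.101} $$

for any functions ρ, j of r, t. The expression for Eq. (9.101) as an integral
over momentum space appears in Eq. (9.89).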

In the applications of Eq. (9.98) these ρ, j are some function of x
and ẋ, so we obtain the result that

$$ K(\text{gnd}, \text{gnd}) = \int \exp\left\{\frac{i}{\hbar}\left(S_1[x] + J[x]\right)\right\} \mathcal{D}x \tag{9.102} $$

where J[x], a functional of x(t), is given by Eq. (9.101) with the correct
ρ, j substituted. This summarizes the results for gnd-gnd transitions. We
express the modifying effect of the field on the action of the particles by
the addition of J[x] to $S_1[x]$. The central formula for electrodynamics
then is the general result of Eqs. (9.100) and (9.101).


General Formulation of Quantum Electrodynamics. It is also
of interest to pursue these matters in a different direction, by integrating
over the matter coordinates first, and leaving the field variables for later.
We shall limit ourselves to a brief general description of what results from
this procedure. If in Eq. (9.98) we contemplate integrating x first, the
factor $e^{(i/\hbar)S_3}$ is a constant and can be left out. We can therefore write
Eq. (9.98) this way: If we define

$$ T[A, \phi] = \int \exp\left\{\frac{i}{\hbar}\left(S_1[x] + S_2[x, A, \phi]\right)\right\} \mathcal{D}x \tag{9.103} $$

then

$$ K = \int e^{(i/\hbar)S_3[A, \phi]}\,T[A, \phi]\,\mathcal{D}A\,\mathcal{D}\phi \tag{9.104} $$

This K gives us the amplitude that the particle goes through a
certain motion and the field undergoes a certain transition. Like all other
amplitudes, it is the sum over all possible alternatives. Each separate
alternative is constructed as the amplitude T[A, φ] for the motion in a
particular field A, φ times the amplitude $e^{(i/\hbar)S_3}$ that the field is A, φ.
In carrying out the sum, we sum over all possible fields A, φ.

This law, given by Eq. (9.104), is the general fundamental rule for
all of quantum electrodynamics. It is a correct formulation even when
the functional T[A, φ], the amplitude for the motion of the particles in
an external potential A, φ, cannot be represented as a path integral.
For example, for a relativistic particle with spin (described by the Dirac
equation), the quantity T[A, φ] cannot be described by a simple path
integral based on any reasonable action. However, it is possible to
calculate T[A, φ] by other means, for example, from the Dirac equation.
After the form of this functional has been derived, the amplitude K can
be worked out, in principle, from Eq. (9.104).

In stating the law of quantum electrodynamics in the form of 
Eq. (9.104), we have isolated the behavior of the electromagnetic field 
from the behavior of the particle (or system of particles) on which it 
acts. That such an isolation can be carried out is an important result. 
For example, the functional T[A, φ] may represent the behavior of a
nucleus whose properties are not completely known. However, if we know
only the behavior of the nucleus in an external field, then we can solve 
quantum-electrodynamic problems involving nuclei. 

Of course, to use Eq. (9.104) strictly, T must be known as a functional
of A, φ for all A, φ, but this much information is rarely available. Even
if it is available, the path integral over A, φ may not be easy. But in
practice the formula is very useful. Sometimes T can be approximated
by an exponential, linear in A, φ, of exactly the form of Eq. (9.99). Then
the result is obtained directly from the general formula of Eqs. (9.100)
and (9.101). More often, T can be represented by a sum, or integral, over
such exponential forms with various ρ, j and the result of Eq. (9.104) is
a corresponding sum or integral over expressions containing $e^{(i/\hbar)J}$ with
J in Eq. (9.101) involving the corresponding ρ, j.

In most practical situations T can be expressed as a power series in
A, φ. The first few terms can be found from the theory of the matter
considering A, φ as a small perturbation. Subsequent substitution into
Eq. (9.104) and integration over A, φ gives a corresponding perturbation
expansion (in powers of $e^2/\hbar c$) for K. The necessary path integrals such
as

$$ \int e^{(i/\hbar)S_3[A, \phi]}\,A_i(\mathbf{r}_1,t_1)\,A_j(\mathbf{r}_2,t_2)\,\mathcal{D}A\,\mathcal{D}\phi = -i\hbar c\,\delta_{ij}\,\delta_+\!\left(|\mathbf{r}_1 - \mathbf{r}_2|^2 - c^2(t_1 - t_2)^2\right) $$

can be discovered by expanding the general formula of Eqs. (9.100) and
(9.101) on both sides in powers of ρ, j and comparing corresponding
terms. We shall not go further into these matters here, but refer the
reader to the literature (e.g., sec. 8 of R.P. Feynman, Mathematical
Formulation of the Quantum Theory of Electromagnetic Interaction,
Phys. Rev., vol. 80, pp. 440-457, 1950).
Statistical Mechanics


In preceding chapters we have discussed transitions in which a system
goes from one known state to another. In most physically realistic
situations the initial state is not completely known. The system may be
in one or another state with different probabilities associated with each.
In this case the final state is equally uncertain, being that set of states
resulting from the various possible initial states with the corresponding
probabilities. Or we may not be interested in the probability to go to
just one specified final state, but rather the chance to end up in any one
of a set of such states.

An especially interesting case of statistical uncertainty of states is
that corresponding to thermal equilibrium at some temperature T. A
quantum-mechanical system in thermal equilibrium can exist in one or
another energy state. The results of quantum-statistical mechanics show
that the probability that a system is in a state of energy E is
proportional to $e^{-E/kT}$ where kT measures the temperature in natural energy
units. (The conversion factor k, known as Boltzmann’s constant, is
1.38065 × 10⁻¹⁶ erg/K, or 1 eV per 11605 K.)

In this book we shall neither derive nor discuss this exponential dis- 
tribution law. We emphasize that the energy E is the energy of the 
entire system. If an energy level is degenerate, then each state at that 
particular level has equal probability. This means that the total proba- 
bility for the system to have the particular energy value is enhanced by 
a factor corresponding to the number of states in the degenerate level. 

The exponential law given above is not yet a true probability
distribution, since it has not been normalized. The normalizing factor can
be written 1/Z, so that the probability that a system should be in the
state of energy $E_n$ (assumed nondegenerate for the time being) is

$$ p_n = \frac{1}{Z}\,e^{-\beta E_n} \tag{10.1} $$

where β = 1/kT. This means

$$ Z = \sum_n e^{-\beta E_n} \tag{10.2} $$

An equivalent normalization consists of defining an energy F such
that

$$ p_n = e^{-\beta(E_n - F)} \tag{10.3} $$

F is called the Helmholtz free energy. Its value is, of course, dependent
on the temperature T, although the various energy values $E_n$ do not
depend on T. It is evident that

$$ Z = e^{-\beta F} \tag{10.4} $$
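
Equations (10.1) to (10.4) are easy to try out numerically. The following sketch is only an illustration: it assumes a hypothetical spectrum, that of a harmonic oscillator with levels n + 1/2 in units of ħω, truncated after 200 levels, and it works in units where energies and kT are measured in ħω. The function name `boltzmann` is likewise a made-up helper.

```python
import math

def boltzmann(levels, kT):
    """Return (Z, F, probabilities) for a list of nondegenerate levels."""
    beta = 1.0 / kT
    weights = [math.exp(-beta * E) for E in levels]   # unnormalized exponential law
    Z = sum(weights)                                   # Eq. (10.2)
    F = -kT * math.log(Z)                              # from Eq. (10.4), Z = exp(-beta F)
    probs = [w / Z for w in weights]                   # Eq. (10.1)
    return Z, F, probs

# ASSUMED example spectrum: oscillator levels in units of hbar*omega.
levels = [n + 0.5 for n in range(200)]
kT = 1.0
Z, F, probs = boltzmann(levels, kT)

# Eq. (10.3): the same probabilities written with the free energy F.
probs_from_F = [math.exp(-(E - F) / kT) for E in levels]
print(Z, F, sum(probs))
```

The two ways of writing the probabilities agree term by term, and for this spectrum Z reproduces the closed form 1/[2 sinh(ħω/2kT)] of the geometric series.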

THE PARTITION FUNCTION


The physical properties of a system in thermal equilibrium can be
derived from the exponential distribution function. Suppose A is the
measure of some property and that its mean value in the nth energy state
is

$$ A_n = \int \phi_n^*\,A\,\phi_n\,d\Gamma \tag{10.5} $$

where the integral is taken over the configuration space of the system.
Then the statistical average for A for the whole system is

$$ \bar{A} = \sum_n p_n A_n = \frac{1}{Z} \sum_n A_n\,e^{-\beta E_n} \tag{10.6} $$

For example, the average or expected value of the energy itself is

$$ U = \sum_n p_n E_n = \frac{1}{Z} \sum_n E_n\,e^{-\beta E_n} \tag{10.7} $$

If the normalizing factor Z is known as a function of the temperature,
the sum of Eq. (10.7) can be easily evaluated. From Eq. (10.2) we have

$$ \sum_n E_n\,e^{-\beta E_n} = -\frac{\partial Z}{\partial\beta} = kT^2\,\frac{\partial Z}{\partial T} \tag{10.8} $$

This means that

$$ U = \frac{kT^2}{Z}\,\frac{\partial Z}{\partial T} = kT^2\,\frac{\partial \ln Z}{\partial T} = F - T\,\frac{\partial F}{\partial T} = \frac{\partial(\beta F)}{\partial\beta} \tag{10.9} $$

We have written the derivatives with respect to the temperature as
partial derivatives because other variables, such as the volume of the
system or any external fields, which determine the energy levels are all
held fixed.
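
The statement that U follows from Z alone can be illustrated with a short numerical comparison. In this sketch the spectrum is a made-up list of levels (the repeated value stands for a doubly degenerate level), energies are in arbitrary units, and the derivative with respect to β is taken by a central difference; all names are hypothetical.

```python
import math

# ASSUMED spectrum; the value 0.3 appears twice to mimic a degenerate level.
LEVELS = [0.0, 0.3, 0.3, 1.1, 2.0]

def log_Z(beta):
    """Logarithm of the partition function, Eq. (10.2)."""
    return math.log(sum(math.exp(-beta * E) for E in LEVELS))

def U_direct(beta):
    """Average energy from the weighted sum of Eq. (10.7)."""
    weights = [math.exp(-beta * E) for E in LEVELS]
    return sum(E * w for E, w in zip(LEVELS, weights)) / sum(weights)

def U_from_log_Z(beta, h=1e-5):
    """Average energy as -d(ln Z)/d(beta), equivalent to Eq. (10.9)."""
    return -(log_Z(beta + h) - log_Z(beta - h)) / (2.0 * h)

beta = 1.7
print(U_direct(beta), U_from_log_Z(beta))
```

Both routes give the same number up to the small error of the finite difference, which is the content of Eqs. (10.8) and (10.9).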

It is interesting to see what happens to the expected value of the
energy if some other variable such as the volume is changed. Suppose
the system is in a particular state $\phi_n$, and we make a small change in
the value of a certain parameter, say α. Using a first-order perturbation
principle, we find that the first-order change in energy is equal to the
expected value of the first-order change in the hamiltonian. That is,

$$ E_n + \Delta E_n = \int \phi_n^*\,(H + \Delta H)\,\phi_n\,d\Gamma = E_n + \int \phi_n^*\,\Delta H\,\phi_n\,d\Gamma \tag{10.10} $$

Using the language of classical physics, we would say that the ratio
$-\Delta H/\Delta\alpha$ is the “force” associated with the parameter α. In case this
parameter is the volume, the force is the pressure. That is, we define
the concept of force by

force × change in parameter = −change in energy

or

$$ f_\alpha = -\frac{\partial H}{\partial\alpha} \tag{10.11} $$

As an example, then, if P = pressure and V = volume,

$$ -P\,\Delta V = \Delta E \tag{10.12} $$

We write the expected value of the force as

$$ \bar{f}_\alpha = -\sum_n p_n \left(\frac{\partial H}{\partial\alpha}\right)_{nn} = -\sum_n p_n\,\frac{\partial E_n}{\partial\alpha} = -\frac{1}{Z} \sum_n \frac{\partial E_n}{\partial\alpha}\,e^{-\beta E_n} \tag{10.13} $$

so that

$$ \bar{f}_\alpha = \frac{1}{\beta}\,\frac{\partial \ln Z}{\partial\alpha} \tag{10.14} $$

where β and other parameters are held constant. Using Eq. (10.4), we
can write this as

$$ \bar{f}_\alpha = -\frac{\partial F}{\partial\alpha} \tag{10.15} $$

If the parameter α is the volume V so that $\bar{f}_\alpha$ is the pressure P, we have

$$ P = -\frac{\partial F}{\partial V} \tag{10.16} $$

When the volume is changed by an infinitesimal amount for a
system at a constant temperature, two things happen simultaneously. First,
each energy level shifts slightly. Second, for the system to stay in
equilibrium at a constant temperature T (maintained by a bath, for example),
the probability associated with each energy level changes slightly
(because the energy of that level changes). If the only effect were a change
in the energy of each level, then the change in the total energy of the
system would correspond to this change averaged over all the levels. From
our foregoing discussion this is the negative of pressure times the change
in volume. However, to keep the temperature fixed, some readjustment
of the probability of each level must occur. Thus the total energy must
make an additional change which we will call dQ. This additional
energy comes from the external system (the bath) which maintains the
temperature, and it is called the heat exchanged. Thus

$$ dU = -P\,dV + dQ \tag{10.17} $$


We can find dQ easily from the expression for U given in Eq. (10.7).
When V is altered by the change dV, then each energy level $E_n$
undergoes the change $dE_n$ and the Helmholtz free energy changes by an
amount dF. Thus the total energy changes by the amount

$$ dU = \sum_n dE_n\,e^{-\beta(E_n - F)} + \beta\,dF \sum_n E_n\,e^{-\beta(E_n - F)} - \beta \sum_n E_n\,dE_n\,e^{-\beta(E_n - F)} \tag{10.18} $$

The first term in this expression is the expected value of $dE_n$, which is
−P dV, as we have already explained. The remaining two terms
constitute dQ. These two terms can also be expressed with the derivatives of
the sum in Eq. (10.2), and ultimately in terms of F. In fact, we find

$$ dQ = -T\,\frac{\partial^2 F}{\partial T\,\partial V}\,dV \tag{10.19} $$

That this is true can be seen also from Eq. (10.17), which gives

$$ \frac{dQ}{dV} = \frac{\partial U}{\partial V} + P = \frac{\partial}{\partial V}\left(F - T\,\frac{\partial F}{\partial T}\right) - \frac{\partial F}{\partial V} = -T\,\frac{\partial^2 F}{\partial T\,\partial V} \tag{10.20} $$

Equation (10.19) gives the heat exchanged dQ in changing the volume
by the amount dV while keeping the temperature constant. If we change
any other parameter, we shall arrive at an analogous result. For example,
if we change the temperature T while holding the volume V constant,
the heat exchanged is equal to the change in total energy. That is,

$$ dQ = \frac{\partial U}{\partial T}\,dT = \frac{\partial}{\partial T}\left(F - T\,\frac{\partial F}{\partial T}\right) dT = -T\,\frac{\partial^2 F}{\partial T^2}\,dT \tag{10.21} $$
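In general, then, we have the result

$$ dQ = -T \left(\frac{\partial^2 F}{\partial T\,\partial V}\,dV + \frac{\partial^2 F}{\partial T\,\partial\alpha}\,d\alpha + \frac{\partial^2 F}{\partial T^2}\,dT\right) \tag{10.22} $$

The right-hand side of Eq. (10.22) is of the form T times the total
change in a quantity $S = -(\partial F/\partial T)$, which is called the entropy. That
is,

$$ dQ = T\,dS \tag{10.23} $$

$$ S = -\frac{\partial F}{\partial T} \tag{10.24} $$

$$ U = F + TS \tag{10.25} $$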

It is evident that all the standard thermodynamic quantities — inter- 
nal energy, entropy, pressure, etc. — can be evaluated if a single function, 
the partition function Z, is known in terms of the temperature, volume, 
external field, etc. The thermodynamic quantities are obtained simply 
by differentiating Z or, equivalently, the free energy F. 

The determination of some physical quantities, even for a system in
thermal equilibrium, requires more information than only the partition
function. For example, suppose the system is in a configuration space
with a coordinate x and we ask: What is the probability of finding the
system at location x? We know that if the system is in the single state
defined by the wave function $\phi_n(x)$, the probability of observing x is the
absolute square of the wave function, $\phi_n^*(x)\phi_n(x)$. Thus, averaging over
all possible states, the probability of observing x is

$$ P(x) = \frac{1}{Z} \sum_n \phi_n^*(x)\,\phi_n(x)\,e^{-\beta E_n} \tag{10.26} $$

In the general case, if we are interested in any quantity A, then the
expected value is given by

$$ \bar{A} = \sum_n A_n\,p_n = \frac{1}{Z} \sum_n \int \phi_n^*(x)\,A\,\phi_n(x)\,dx\;e^{-\beta E_n} \tag{10.27} $$

It is evident that the expected values of all such quantities could be
obtained if we knew the function

$$ \rho(x', x) = \sum_n \phi_n(x')\,\phi_n^*(x)\,e^{-\beta E_n} \tag{10.28} $$

This suffices since the function A appearing in the integral of Eq. (10.27)
is an operator which operates only on the $\phi_n(x)$ of that expression, and
not on $\phi_n^*(x)$. Using the quantity ρ(x′, x), we can imagine A to act on x′
only, after which we set x′ equal to x in the form Aρ(x′, x), and finally
integrate over all values of x. This process is called finding the trace of
Aρ.

From the definition of ρ(x′, x) it is clear that

$$ P(x) = \frac{1}{Z}\,\rho(x, x) \tag{10.29} $$

and since the probability P(x) is normalized, so that the integral over
all of x gives 1, we have

$$ Z = \int \rho(x, x)\,dx = \text{trace}\{\rho\} \tag{10.30} $$

The quantity ρ(x′, x) is called the density matrix. [More precisely, it
is called the “statistical density matrix for temperature T”; the term
“density matrix” also has a wider use for general systems in or out of
thermal equilibrium and is usually used for the normalized version of our
function ρ(x′, x), that is, for the function we would write as ρ(x′, x)/Z.]
The general problem of statistical mechanics is to evaluate Eq. (10.28)
to find the density matrix. If we are interested only in conventional
thermodynamic variables, we need only the trace or diagonal sum of the
density matrix, which gives us the partition function Z.
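
The definitions of Eqs. (10.28) to (10.30) can be made concrete with a system whose eigenfunctions are known. The sketch below assumes a particle in a box of width L with real eigenfunctions √(2/L) sin(nπx/L) and levels n² (units with π²ħ²/2mL² = 1); the truncation of the sum, the grid, and all names are illustrative assumptions.

```python
import math

L = 1.0          # box width (hypothetical units)
BETA = 0.2       # 1/kT in units where E_n = n**2
N_LEVELS = 60    # truncation of the sum over states
N_GRID = 400     # number of grid intervals for the x integration

def phi(n, x):
    """Normalized box eigenfunction (real, so phi* = phi)."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)

def energy(n):
    return float(n * n)

def rho(x_prime, x):
    """Unnormalized density matrix of Eq. (10.28), truncated at N_LEVELS."""
    return sum(phi(n, x_prime) * phi(n, x) * math.exp(-BETA * energy(n))
               for n in range(1, N_LEVELS + 1))

def integrate(values, h):
    """Trapezoidal rule on a uniform grid."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

h = L / N_GRID
xs = [i * h for i in range(N_GRID + 1)]
diagonal = [rho(x, x) for x in xs]

Z_trace = integrate(diagonal, h)                        # Eq. (10.30)
Z_sum = sum(math.exp(-BETA * energy(n)) for n in range(1, N_LEVELS + 1))
P = [d / Z_trace for d in diagonal]                     # Eq. (10.29)
mean_x = integrate([x * p for x, p in zip(xs, P)], h)   # trace of x*rho over Z
print(Z_trace, Z_sum, mean_x)
```

The trace of ρ reproduces the sum of Eq. (10.2), and the expected position comes out at the middle of the box, as the symmetry of the problem requires.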


THE PATH INTEGRAL EVALUATION 


The formulation of the density matrix given in Eq. (10.28) bears a close
resemblance to the general expression for the kernel, which was derived
in Chap. 4 and given in Eq. (4.59) as

$$ K(x_b, t_b; x_a, t_a) = \sum_n \phi_n(x_b)\,\phi_n^*(x_a)\,e^{-(i/\hbar)E_n(t_b - t_a)} \tag{10.31} $$

The validity of this expression is restricted to situations in which the
hamiltonian is constant in time and $t_b > t_a$. However, this situation is
implied in statistical mechanics; for only if the hamiltonian is constant
in time can equilibrium be achieved. The difference between the form of
Eq. (10.31) and that of Eq. (10.28) is in the argument of the exponential.
If the time difference $t_b - t_a$ of Eq. (10.31) is replaced by −iβħ, we see
that the expression for the density matrix is formally identical to the
expression for the kernel corresponding to an imaginary negative time
interval.

We can develop the similarity between these two expressions from
another point of view. Suppose we write the density matrix in a way
which makes it look a little bit more like a kernel, thus, $k(x_b, u_b; x_a, u_a)$
for $\rho(x_b, x_a)$, where

$$ k(x_b, u_b; x_a, u_a) = \sum_n \phi_n(x_b)\,\phi_n^*(x_a)\,e^{-[(u_b - u_a)/\hbar]E_n} \tag{10.32} $$

Then if $x_b = x'$, $x_a = x$, $u_b = \beta\hbar$, and $u_a = 0$, Eq. (10.32) becomes
identical with Eq. (10.28).

If we differentiate k partially with respect to $u_b$, we get

$$ \frac{\partial k(b, a)}{\partial u_b} = -\frac{1}{\hbar} \sum_n E_n\,\phi_n(x_b)\,\phi_n^*(x_a)\,e^{-[(u_b - u_a)/\hbar]E_n} \tag{10.33} $$

But now recall that $E_n\phi_n(x_b) = H_b\phi_n(x_b)$; so if we understand $H_b$ to
imply operations only upon the variables $x_b$, we can write

$$ \frac{\partial k(b, a)}{\partial u_b} = -\frac{1}{\hbar}\,H_b\,k(b, a) \tag{10.34} $$

or, to put the same thing another way,

$$ \frac{\partial\rho(b, a)}{\partial\beta} = -H_b\,\rho(b, a) \tag{10.35} $$

We notice that this differential equation for ρ is similar to the
Schrödinger equation for the kernel K which was developed in Chap. 4 and
given in Eq. (4.25). We can rewrite it here as

$$ \frac{\partial K(b, a)}{\partial t_b} = -\frac{i}{\hbar}\,H_b\,K(b, a) \qquad \text{for } t_b > t_a \tag{10.36} $$

We found in Chap. 4 that the kernel K(b, a) is Green’s function for
Eq. (10.36). In the same sense the density matrix ρ(b, a) is Green’s
function for Eq. (10.35).

With simple hamiltonians involving only momenta and coordinates
we have been able to write the kernel as a path integral. For example, in
a one-particle, one-dimensional situation where the hamiltonian is given
by

$$ H = -\frac{\hbar^2}{2m}\,\frac{d^2}{dx^2} + V(x) \tag{10.37} $$

the solution for the kernel over a very short time interval
$t_b - t_a = \epsilon$
is

$$ K(b, a) = \left(\frac{m}{2\pi i\hbar\epsilon}\right)^{1/2} \exp\left\{\frac{i}{\hbar}\left[\frac{m(x_b - x_a)^2}{2\epsilon} - \epsilon\,V\!\left(\frac{x_b + x_a}{2}\right)\right]\right\} \tag{10.38} $$

which can be directly verified by substitution into Eq. (10.37) and taking
the limit ε → 0. By building up a product of many kernels of the
form (10.38), and summing over paths, and taking the limit as the time
interval ε goes to 0 and the number of terms in the product becomes
infinite, we have produced a path integral describing the kernel over a
finite period of time.

We can produce a solution to Eq. (10.34) in the same manner. The
solution for an infinitesimal interval of $u_b - u_a = \eta$ is given by
substituting ε = −iη into Eq. (10.38). Thus

$$ k(x_b, \eta; x_a, 0) = \left(\frac{m}{2\pi\hbar\eta}\right)^{1/2} \exp\left\{-\frac{1}{\hbar}\left[\frac{m}{2\eta}(x_b - x_a)^2 + \eta\,V\!\left(\frac{x_b + x_a}{2}\right)\right]\right\} \tag{10.39} $$

That this is a valid solution of Eq. (10.34) in the limit η → 0 can be
demonstrated by direct substitution.

The rule for the combination of functions defined for successive values
of $u$ is the same as the rule for the combination of kernels for successive
intervals of time. That is,

$$k(b,a) = \int k(b,c)\,k(c,a)\,dx_c \tag{10.40}$$

That this result still holds follows from the fact that Eq. (10.33) is a
first-order derivative in $u$. This rule can be used to obtain the path
integral to define $k(b,a)$ as

$$k(b,a) = \lim_{\eta\to 0}\frac{1}{a}\int\!\cdots\!\int \exp\left\{-\frac{1}{\hbar}\sum_{i=1}^{N}\left[\frac{m}{2\eta}(x_i-x_{i-1})^2 + \eta V\!\left(\frac{x_i+x_{i-1}}{2}\right)\right]\right\}\prod_{i=1}^{N-1}\frac{dx_i}{a} \tag{10.41}$$

The normalizing constant $a$ now becomes

$$a = \left(\frac{2\pi\hbar\eta}{m}\right)^{1/2} \tag{10.42}$$

and the integral is carried out over all paths going from $x_a$ to $x_b$ (that is,
for $i = 0$, $x_i = x_a$ and for $i = N$, $x_i = x_b$) in the interval $u_b - u_a = N\eta$.
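The construction of Eqs. (10.39) to (10.42) can be tried numerically. The following is a minimal sketch, not a production method: it assumes units with $\hbar = m = \omega = 1$, a harmonic potential, and a uniform grid in $x$ whose width and spacing are arbitrary choices. The function names are illustrative. Each short-"time" kernel becomes a matrix, Eq. (10.40) becomes a matrix product, and the trace approximates the partition function, which for the oscillator should approach $1/[2\sinh(\beta\hbar\omega/2)]$ as $\eta$ shrinks.

```python
import math

def short_time_kernel(xb, xa, eta, potential, m=1.0, hbar=1.0):
    """Eq. (10.39): k(xb, eta; xa, 0) for one small step eta in u."""
    prefactor = math.sqrt(m / (2.0 * math.pi * hbar * eta))
    action = m * (xb - xa) ** 2 / (2.0 * eta) + eta * potential(0.5 * (xb + xa))
    return prefactor * math.exp(-action / hbar)

def matmul(a, b):
    """Plain-Python matrix product; adequate for the small grid used here."""
    b_columns = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in b_columns]
            for row in a]

def partition_function(beta, potential, doublings=6, half_width=5.0, points=81):
    """Trace of the discretized density matrix, in units with hbar = m = 1."""
    eta = beta / 2 ** doublings          # N * eta = beta * hbar, N = 2**doublings
    dx = 2.0 * half_width / (points - 1)
    xs = [-half_width + i * dx for i in range(points)]
    # One application of Eq. (10.40): each matrix element carries the measure dx_c.
    step = [[short_time_kernel(xb, xa, eta, potential) * dx for xa in xs]
            for xb in xs]
    for _ in range(doublings):           # raise to the power N by repeated squaring
        step = matmul(step, step)
    # step[i][j] now approximates rho(x_i, x_j) * dx, so the trace approximates Z.
    return sum(step[i][i] for i in range(points))

def harmonic(x):
    """Oscillator potential with m = omega = 1."""
    return 0.5 * x * x

if __name__ == "__main__":
    beta = 2.0
    print("discretized Z:", partition_function(beta, harmonic))
    print("closed form  :", 1.0 / (2.0 * math.sinh(beta / 2.0)))
```

With the midpoint form of the potential the discretized value lies slightly above the closed form for finite $\eta$, and the gap narrows as the number of steps is increased.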

The result of this derivation is that if we consider a “path” $x(u)$ as
a function which gives a coordinate in terms of the parameter $u$, and if
we call $\dot{x}$ the derivative $dx/du$, then

$$\rho(x_b, x_a) = \int \exp\left\{-\frac{1}{\hbar}\int_0^{\beta\hbar}\left[\frac{m}{2}\dot{x}^2(u) + V(x(u))\right]du\right\}\mathcal{D}x(u) \tag{10.43}$$

This is a very amusing result, because it gives the complete statistical
behavior of a quantum-mechanical system as a path integral without the
appearance of the ubiquitous $i$ so characteristic of quantum mechanics.
(Incidentally, this is not so for a system moving in a magnetic field.) This
path integral of Eq. (10.43) is much easier to work with and visualize
than the complex integrals which we have studied previously. Here, it
is easy to see why some paths contribute very little to the integral:
these are the paths for which the exponent is very large and thus the
integrand is negligibly small. Furthermore, it is not necessary to think
about whether or not nearby paths cancel each other’s contributions,
since in the present case all contributions add together with some being
large and others small.

The parameter $u$ is not the true time in any sense. It is just a
parameter in an expression for the density matrix $\rho$. However, if we wish
to think through analogy, we can consider $u$ as the time for a certain
path, and in so doing we can state the result given by Eq. (10.43) in a
vivid pictorial way. What we are doing is providing a physical analogue
for the mathematical expression. We shall call $u$ the “time,” leaving the
quotation marks to remind us that it is not real time (although $u$ does
have the dimensions of time). Likewise $\dot{x}$ will be called the “velocity,”
$m\dot{x}^2/2$ the “kinetic energy,” etc. Then Eq. (10.43) says that the density
matrix for a temperature $1/k\beta$ is given in the following way:

Consider all the possible paths, or “motions,” by which the system
can travel between the initial and final configurations in the “time” $\beta\hbar$.
The density matrix $\rho$ is a sum of contributions from each motion, the
contribution from a particular motion being the exponential of minus the
“time” integral of the “energy” divided by $\hbar$ for the path in question.

The partition function is derived by considering only those paths in
which the final configuration is the same as the initial configuration, and
we sum over all possible initial configurations.


Problem 10-1 Show that the density matrix for a one-dimensional
harmonic oscillator is

$$\rho(x',x) = \left(\frac{m\omega}{2\pi\hbar\sinh\beta\hbar\omega}\right)^{1/2}\exp\left\{-\frac{m\omega}{2\hbar\sinh\beta\hbar\omega}\left[(x'^2+x^2)\cosh\beta\hbar\omega - 2x'x\right]\right\} \tag{10.44}$$


This answer can be compared with the results of Prob. 3-8. Show also
that the free energy is $kT\ln[2\sinh(\hbar\omega/2kT)]$. Check this latter value by
a direct evaluation of the sum of Eq. (10.2).

The Classical Approximation. If the temperature is not too low
(how low is too low will be discussed below Eq. (10.49)), $\beta\hbar$ is very small.
Thus, in calculating the partition function for which $x_b = x_a$, each path
starts from $x_a$ and in a very short “time” is back at $x_a$ again. In fact,
the paths cannot ever wander very far from $x_a$, because traveling far
away and returning again in the short “time” available requires a high
“velocity” and a large “kinetic energy.” For such a path the exponential
function appearing in Eq. (10.43) becomes negligibly small, and it will
contribute a negligible amount to the sum over all paths. Under these
circumstances the paths $x(u)$ which must be considered in evaluating
$V(x(u))$ never move very far from the initial point $x_a$. Thus to a first
approximation we can write $V(x(u)) \approx V(x_a)$ for all paths. In this
approximation the potential energy is independent of the path, and the
exponential function dependent on the potential can be taken outside
the integral. Thus for temperatures which are not too low we have

$$\rho(x_a, x_a) = e^{-\beta V(x_a)}\int_{x_a}^{x_a}\exp\left\{-\frac{m}{2\hbar}\int_0^{\beta\hbar}\dot{x}^2(u)\,du\right\}\mathcal{D}x(u) \tag{10.45}$$

In this last expression the path integral is that for a free particle. It
can be solved in the same way that we solved the path integral defining
the kernel for the motion of a free particle in Chap. 3. The result is

$$\int_{x_a}^{x_b}\exp\left\{-\frac{m}{2\hbar}\int_0^{\beta\hbar}\dot{x}^2(u)\,du\right\}\mathcal{D}x(u) = \sqrt{\frac{mkT}{2\pi\hbar^2}}\,\exp\left\{-\frac{mkT}{2\hbar^2}(x_b-x_a)^2\right\} \tag{10.46}$$

If we are interested only in the partition function, we set $x_b = x_a$ and
find

$$\rho(x_a, x_a) = \sqrt{\frac{mkT}{2\pi\hbar^2}}\,e^{-\beta V(x_a)} \tag{10.47}$$

Then, the partition function is the integral of this expression over all
possible initial configurations $x_a$. Thus

$$Z = \sqrt{\frac{mkT}{2\pi\hbar^2}}\int e^{-\beta V(x_a)}\,dx_a \tag{10.48}$$


This is a formula for the partition function valid in the limit of
classical mechanics. It was originally derived, within an uncertain multiplying
constant, as a consequence of classical mechanics by Boltzmann. In more
complicated cases (e.g., more variables) the classical partition function
is simply the product of two factors. The first of these is the path
integral which one would get by considering all particles of the system to
be free. The second factor is called the configuration integral, and it is
just the integral of $e^{-\beta V}$, where the potential energy $V$ of the system
depends upon all of the $N$ variables describing the system. For example,
for $N$ particles interacting by a potential $V(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N)$, where $\mathbf{x}_i$ is
the position vector of particle $i$, the integral required is

$$\int\!\!\int\!\cdots\!\int \exp\{-\beta V(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N)\}\,d^3\mathbf{x}_1\,d^3\mathbf{x}_2\cdots d^3\mathbf{x}_N$$


This simple form for the partition function is only an approximation
valid if the particles of the system cannot wander very far from their
initial positions in the “time” $\beta\hbar$. The limit on the distance which
the particles can wander before the approximation breaks down can be
estimated from Eq. (10.46). We see that if the final point differs from
the initial point by as much as

$$\Delta x = \frac{\hbar}{\sqrt{mkT}} \tag{10.49}$$

the exponential function of Eq. (10.46) becomes very greatly reduced.
From this, we can infer that intermediate points farther than $\Delta x$ away
from the initial and final points can be reached only on paths which
do not contribute greatly to the path integral of Eq. (10.43). If the
potential $V(x)$ does not alter very much as $x$ moves over this distance,
then the classical statistical mechanics is valid.

For a typical solid or liquid at room temperature, with an atomic
mass of about 20, for example, $\Delta x$ is about 0.1 Å, while the interatomic
distances and forces range over one or two angstroms. Thus motions
greater than 0.1 Å will not contribute to the density matrix, while the
potential function will remain unchanged until motion of about one or
two angstroms has been achieved. It is clear that classical statistical
mechanics is adequate for such materials.
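The estimate of 0.1 Å can be reproduced directly from $\Delta x = \hbar/\sqrt{mkT}$. The sketch below assumes that "room temperature" means 300 K and that "atomic mass of about 20" means 20 atomic mass units; the function name and the rounded physical constants are choices made here for illustration.

```python
import math

HBAR = 1.054571817e-34      # J s
K_BOLTZMANN = 1.380649e-23  # J / K
AMU = 1.66053907e-27        # kg

def wander_distance(mass_amu, temperature_kelvin):
    """Delta x = hbar / sqrt(m k T), returned in meters."""
    mass = mass_amu * AMU
    return HBAR / math.sqrt(mass * K_BOLTZMANN * temperature_kelvin)

if __name__ == "__main__":
    dx = wander_distance(20.0, 300.0)       # assumed: 20 amu at 300 K
    print("Delta x in angstroms:", dx / 1.0e-10)
```

Under these assumptions the result comes out a little below 0.1 Å, an order of magnitude smaller than the one or two angstroms over which the interatomic potential varies.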

All of the mysterious transformations between solids, liquids, and 
gases ordinarily lie in a range where classical statistical mechanics is 
valid. The mathematical interpretation of all these processes is contained
in the problem of evaluating the integral of $e^{-\beta V}$ over the coordinates
of all the atoms. That this amazing variety and peculiarity of
phenomena comes from just a simple integral is at first surprising, until 
it is realized that the integral is a multiple integral over a stupendous 
number of variables. Our usual experience with integrals which involve 
one or, at most, a few variables of integration does not prepare us for 
the almost qualitative differences that can arise when the number of 
variables approaches infinity. 


The fascination of the problems in the theory of the solid state, or
of liquids and of the condensation of gases, lies, like the behavior of
this multiple integral, in the way in which simple descriptions of simple
systems when joined together in enormous multiplicity yield such rich
phenomena. It is a challenge to the imagination to see how the cooperation
between systems can lead to such results. A rough qualitative
explanation is readily forthcoming for many of these effects, but the
problem of quantitative detail also holds fascination for the theoretical
physicist.

There are important statistical phenomena which occur when the 
classical approximation is not valid. In this case the multiplicity of 
variables is compounded with the conceptual complexity of quantum 
mechanics to raise even greater challenges. 

Strictly speaking, Eq. (10.48) does contain a little more information 
than was available to the purely classical statistical mechanics. This is 
evidenced by the appearance of $\hbar$ in the coefficient in front of the
integral. Classical mechanics could not determine the partition function
absolutely, but only within an unidentified constant factor. Thus the 
logarithm of the partition function was determined only to within an 
additive constant. That meant that a term proportional to T appeared 
in the expression for the free energy, or an additive constant appeared 
in the entropy. This constant, which was sometimes called the chem- 
ical constant, could be completely evaluated only after the quantum- 
mechanical solution was worked out. 


10-3 QUANTUM-MECHANICAL EFFECTS


There are some cases in which the classical approach is not adequate. 
For these cases it is necessary to include changes in the potential function 
which result from the motion along the “path.” In this section we shall 
calculate the first-order effect of the potential when the motion of the 
particle is taken into account. 

Instead of approximating $V(x)$ by the constant value $V(x_a)$ in the
expression for the density matrix, Eq. (10.43), we might try a Taylor
series expansion for $V(x)$ around the point $x_a$. However, we would find
that we could save effort and increase our accuracy if we chose to expand
about the mean position given by

$$\bar{x} = \frac{1}{\beta\hbar}\int_0^{\beta\hbar}x(u)\,du \tag{10.50}$$


which is defined for any particular path. We can characterize each path
by its mean position and carry out integrations over all such positions
instead of integrations over initial positions $x_a$, as was done in Eq. (10.48).
In this way the partition function becomes

$$Z = \int\!\!\int \exp\left\{-\frac{1}{\hbar}\int_0^{\beta\hbar}\left[\frac{m}{2}\dot{x}^2(u) + V(x(u))\right]du\right\}\mathcal{D}'x(u)\,d\bar{x} \tag{10.51}$$

In this expression the paths are chosen to satisfy two conditions: (1) that
$\bar{x}$ given by Eq. (10.50) is fixed and (2) that the initial and final points
are identical. This implies that the integral over all paths must also
include an integration over all end points $x_a$, and that is the meaning
of the notation $\mathcal{D}'$ in the differential.

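Using a Taylor series expansion for $V(x)$ about the point $\bar{x}$, we find

$$\int_0^{\beta\hbar}V(x(u))\,du = \beta\hbar V(\bar{x}) + \int_0^{\beta\hbar}(x(u)-\bar{x})\,V'(\bar{x})\,du + \frac{1}{2}\int_0^{\beta\hbar}(x(u)-\bar{x})^2\,V''(\bar{x})\,du + \cdots \tag{10.52}$$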

By virtue of Eq. (10.50) the second term on the right-hand side of this 
last equation is zero. Thus, by expanding about the mean position, we 
arrive at an expression for which the first nonzero correction term is of 
second order. Using this expansion and including no terms of higher 
order than the second, we have for the partition function 


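$$Z \approx \int e^{-\beta V(\bar{x})}\int \exp\left\{-\frac{1}{\hbar}\int_0^{\beta\hbar}\left[\frac{m}{2}\dot{x}^2(u) + \frac{1}{2}(x(u)-\bar{x})^2\,V''(\bar{x})\right]du\right\}\mathcal{D}'x(u)\,d\bar{x} \tag{10.53}$$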


The path integral in this expression differs from those of our previous
experience in one particular way. The paths over which the integral is
to be evaluated are constrained by Eq. (10.50), which can be rewritten
for present purposes as

$$\frac{1}{\beta\hbar}\int_0^{\beta\hbar}(x(u)-\bar{x})\,du = 0$$

The substitution $y(u) = x(u) - \bar{x}$ as the path coordinate then gives the
constraint in the form

$$\frac{1}{\beta\hbar}\int_0^{\beta\hbar}y(u)\,du = 0$$

and the path integral itself is

$$\int \exp\left\{-\frac{1}{\hbar}\int_0^{\beta\hbar}\left[\frac{m}{2}\dot{y}^2 + \frac{1}{2}V''(0)\,y^2\right]du\right\}\mathcal{D}'y(u) \tag{10.54}$$

The integrand of this path integral is the same as that for a harmonic
oscillator, with the frequency given by $\omega^2 = V''(0)/m$. Here $V''(0)$
denotes the second derivative of the potential at $y = 0$, that is, $V''(\bar{x})$.

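We now apply the constraint to this path integral in the following
way. We multiply the whole path integral by the Dirac delta function

$$\delta\!\left(\frac{1}{\beta\hbar}\int_0^{\beta\hbar}y(u)\,du\right)$$

In order to manipulate the delta function within the path integral, we
express it by its Fourier transform

$$\delta(x) = \int_{-\infty}^{\infty}e^{ikx}\,\frac{dk}{2\pi}$$

and write Eq. (10.54) as

$$\int\!\!\int \exp\left\{-\frac{1}{\hbar}\int_0^{\beta\hbar}\left[\frac{m}{2}\dot{y}^2 + \frac{1}{2}V''(0)\,y^2 - \frac{ik}{\beta}\,y\right]du\right\}\mathcal{D}y(u)\,\frac{dk}{2\pi} \tag{10.55}$$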





In this form, the path integral contains the constraint of Eq. (10.50),
and we can drop the prime on $\mathcal{D}'$ and proceed directly with standard
path integral techniques to obtain the desired solution. We note that the
integrand of the path integral now has the same form as the path integral
for a forced harmonic oscillator if we interpret both $m$ and $V''(0)$ to be
imaginary. However, we are interested only in the case $V''(0)$ small, and
the approximation of including only the first-order term $V''(0)$ can be
made at any convenient stage.


Problem 10-2 Use the methods of Chap. 3 and, in particular, 
Eq. (3.66) to solve this path integral. Remember that paths of interest 
in this problem have the same initial and final points and that completion 
of the path integral requires an integration over all values of this point. 
Finally, carry out the integration over k to get as a solution 


$$\text{const}\,\frac{\beta\hbar\omega/2}{\sinh(\beta\hbar\omega/2)} = \text{const}\left[1 - \frac{\beta^2\hbar^2}{24m}V''(\bar{x}) + \cdots\right] \tag{10.56}$$


The partition function which results from the solution obtained in
Prob. 10-2 is written best in the form (valid to first order in $V''$)

$$Z = \sqrt{\frac{mkT}{2\pi\hbar^2}}\int \exp\left\{-\beta\left[V(\bar{x}) + \frac{\beta\hbar^2}{24m}V''(\bar{x})\right]\right\}d\bar{x} \tag{10.57}$$

Here the unknown constant has been evaluated simply by comparison
with the classical result of Eq. (10.48). We see from this result that
the partition function has the same form which we derived under classical
assumptions. The only difference is that a temperature-dependent
corrective term has been added to the potential. This corrective term,
$(\beta\hbar^2/24m)\,V''(\bar{x})$, is clearly quantum-mechanical in nature, as can be
seen from the inclusion of Planck’s constant $\hbar$.


Problem 10-3 Show that for many particles (which we identify by
subscripts so that the mass of the $i$th particle is $m_i$) moving in three
dimensions, the correction to the potential is

$$\frac{\beta\hbar^2}{24}\sum_i\frac{1}{m_i}\nabla_i^2 V \tag{10.58}$$


In practice, the results of this calculation are not very useful. In 
most problems, e.g., in a gas of colliding molecules, the potential rises 
very sharply so that there is a violent repulsion at small distances. In 
such a case the second derivative is very large. When this is not so, the 
formula may be of some use. It has one advantage in that it may be 
easily extended to another order of accuracy. 


Problem 10-4 Show that the correction to the partition function
up to the order of $\hbar^4$ contains the factor

$$\frac{7\beta^4\hbar^4}{8\times 720\,m^2}\left[V''(\bar{x})\right]^2 - \frac{\beta^3\hbar^4}{24\times 48\,m^2}\,V''''(\bar{x})$$


The Effective Potential Method. We have seen above that
quantum-mechanical effects might be represented by calculating the
partition function exactly as in the classical formula of Eq. (10.48), but
instead of using the correct potential $V(x)$, we use a modified potential
$V(x) + (\beta\hbar^2/24m)V''(x)$. This suggests that we try to go further and
seek some possibly better effective potential $U(x)$ which, when substituted
for the true $V(x)$ in the classical equation (10.48), would represent
an even better approximation to the correct quantum-mechanical
partition function.

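We start out with the exact expression

$$Z = \int e^{-\beta V(\bar{x})}\int \exp\left\{-\frac{m}{2\hbar}\int_0^{\beta\hbar}\dot{x}^2\,du - \frac{1}{\hbar}\int_0^{\beta\hbar}\left[V(x(u)) - V(\bar{x})\right]du\right\}\mathcal{D}'x(u)\,d\bar{x} \tag{10.59}$$

The path integral within this expression is related to an average over
paths $x(u)$. To be specific, for any functional $g$ of $x(u)$, the weighted
average of $g$ over all paths starting and ending at the same point, and
with average value $\bar{x}$, weighted by $\exp\{-(m/2\hbar)\int\dot{x}^2\,du\}$, is

$$\langle g\rangle = \frac{\int \exp\left\{-\frac{m}{2\hbar}\int\dot{x}^2\,du\right\}g[x(u)]\,\mathcal{D}'x(u)}{\int \exp\left\{-\frac{m}{2\hbar}\int\dot{x}^2\,du\right\}\mathcal{D}'x(u)}$$

We call the denominator $B(\bar{x})$, so the path integral within Eq. (10.59)
is $B(\bar{x})\,\langle e^{f[x(u)]}\rangle$, where

$$f[x(u)] = -\frac{1}{\hbar}\int_0^{\beta\hbar}\left[V(x(u)) - V(\bar{x})\right]du \tag{10.60}$$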


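If we were to replace this average of an exponential with the
exponential of an average, thus

$$\langle e^f\rangle \to e^{\langle f\rangle} \tag{10.61}$$

we know we would make an error of the second order in $f$ or, better,
of the order of the difference between $\langle f\rangle^2$ and $\langle f^2\rangle$. We shall see at
Eq. (11.6) that we can determine the sign of this error, i.e., the left-hand
side is greater than the right. The exact and approximate partition
functions are then

$$Z = \int e^{-\beta V(\bar{x})}\,B(\bar{x})\,\langle e^f\rangle\,d\bar{x} \qquad\text{and}\qquad Z \approx \int e^{-\beta V(\bar{x})}\,B(\bar{x})\,e^{\langle f\rangle}\,d\bar{x}$$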
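The size and sign of the error made in Eq. (10.61) can be illustrated with an ordinary weighted average in place of the path average. The sketch below is only an analogy: the four values of $f$ and their weights are invented for illustration and have nothing to do with any particular potential. It shows that $\langle e^f\rangle$ is not less than $e^{\langle f\rangle}$, and that for small $f$ the relative excess is close to half of $\langle f^2\rangle - \langle f\rangle^2$.

```python
import math

def weighted_mean(values, weights):
    """Average of values with normalized weights."""
    return sum(w * v for v, w in zip(values, weights))

# Hypothetical sample of f values and normalized weights (illustration only).
f_values = [-0.3, -0.1, 0.0, 0.2]
weights = [0.1, 0.4, 0.3, 0.2]

mean_f = weighted_mean(f_values, weights)
mean_f_squared = weighted_mean([f * f for f in f_values], weights)
mean_exp_f = weighted_mean([math.exp(f) for f in f_values], weights)

exp_mean_f = math.exp(mean_f)                  # right-hand side of Eq. (10.61)
relative_excess = mean_exp_f / exp_mean_f - 1.0
half_variance = 0.5 * (mean_f_squared - mean_f ** 2)

if __name__ == "__main__":
    print("<e^f>           =", mean_exp_f)
    print("e^<f>           =", exp_mean_f)
    print("relative excess =", relative_excess)
    print("variance / 2    =", half_variance)
```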


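To evaluate the path integral

$$\langle f\rangle = \frac{1}{B(\bar{x})}\int \exp\left\{-\frac{m}{2\hbar}\int_0^{\beta\hbar}\dot{x}^2\,du\right\}\left(-\frac{1}{\hbar}\right)\int_0^{\beta\hbar}\left[V(x(u)) - V(\bar{x})\right]du\;\mathcal{D}'x(u) \tag{10.62}$$

we first change variables to the paths $y(u) = x(u) - \bar{x}$ where

$$\bar{y} = \frac{1}{\beta\hbar}\int_0^{\beta\hbar}y(u)\,du = 0 \tag{10.63}$$

so that

$$\langle f\rangle = -\frac{1}{\hbar B(\bar{x})}\int_0^{\beta\hbar}\left[\int \exp\left\{-\frac{m}{2\hbar}\int_0^{\beta\hbar}\dot{y}^2\,du\right\}\left[V(\bar{x}+y(t)) - V(\bar{x})\right]\mathcal{D}'y(u)\right]dt$$

Second, we define the related path integral

$$I(\bar{x}) = \int \exp\left\{-\frac{m}{2\hbar}\int_0^{\beta\hbar}\dot{y}^2\,du\right\}\left[V(\bar{x}+y(t)) - V(\bar{x})\right]\mathcal{D}'y(u) \tag{10.64}$$

where $t$ is some specific value of $u$ between 0 and $\beta\hbar$. It is clear that

$$\langle f\rangle = -\frac{1}{\hbar B(\bar{x})}\int_0^{\beta\hbar}I(\bar{x})\,dt$$

and that $B(\bar{x})$ is just $I(\bar{x})$ in the special case $[V(\bar{x}+y(t)) - V(\bar{x})] = 1$.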

At first glance it appears that $I(\bar{x})$ is a function of $t$, but the following
argument shows that $I(\bar{x})$ is in fact independent of $t$. Suppose each path
in the integral is not of finite length, but is really a $\beta\hbar$-length segment of
a periodic path whose period is $\beta\hbar$, as shown in Fig. 10-1. Consider two
of the family of all such paths, one $y(u)$ and the other $y(u + t_1) = y_1(u)$,
as shown in Fig. 10-2. The value which the first attains at $u = t_1$,
namely, $y(t_1)$, is reached by the second when its argument is 0, that is,
$y(t_1) = y_1(0)$. Furthermore, for any other point $t_i$ there exists in the
family the analogous function $y_i(u)$ for which $y(t_i) = y_i(0)$, and all such
paths give the same contribution to

$$\int_0^{\beta\hbar}\dot{y}^2(u)\,du$$

Of course, all these statements apply to each path included in the path
integral. Thus we see that we lose nothing by arbitrarily setting $t = 0$
in the path integral over all paths $y(u)$, which is the same as saying that
the integral $I(\bar{x})$ is independent of $t$. Therefore

$$\langle f\rangle = -\beta\,\frac{I(\bar{x})}{B(\bar{x})}$$





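Problem 10-5 Using the method outlined above Prob. 10-2, and
Eq. (3.62), show that

$$I(\bar{x}) = B(\bar{x})\sqrt{\frac{6m}{\pi\beta\hbar^2}}\int e^{-6mY^2/\beta\hbar^2}\left[V(\bar{x}+Y) - V(\bar{x})\right]dY \tag{10.65}$$

and therefore

$$\langle f\rangle = -\beta\sqrt{\frac{6m}{\pi\beta\hbar^2}}\int e^{-6mY^2/\beta\hbar^2}\left[V(\bar{x}+Y) - V(\bar{x})\right]dY$$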


Suppose we call our approximation to the partition function $Z'$ and
the corresponding Helmholtz free energy $F'$, so that $Z' = e^{-\beta F'}$. We
then have

$$Z' = \int e^{-\beta V(\bar{x})}\,e^{\langle f\rangle}\,B(\bar{x})\,d\bar{x} \tag{10.66}$$





Fig. 10-1 All paths which return at $u = \beta\hbar$ to their initial value (at $u = 0$) can be
considered as $\beta\hbar$-length segments of periodic paths where the period is $\beta\hbar$.


Fig. 10-2 Suppose one of the “periodic” paths $y(u)$, as shown in Fig. 10-1, has the
value $y(t_1)$ at $u = t_1$. Then the collection of all “periodic” paths must contain this
same path slipped left a distance $t_1$, that is, $y(u + t_1)$, which will have this same value
at $u = 0$. The result of a path integral average over all such paths must then be
independent of the selection of the initial point on the $u$ axis.


The factor $B(\bar{x})$ was evaluated in Eq. (10.46), and we have

$$Z' = \sqrt{\frac{mkT}{2\pi\hbar^2}}\int e^{-\beta U(\bar{x})}\,d\bar{x} \tag{10.67}$$

where

$$U(\bar{x}) = \sqrt{\frac{6m}{\pi\beta\hbar^2}}\int e^{-6my^2/\beta\hbar^2}\,V(\bar{x}+y)\,dy \tag{10.68}$$

The term $V(\bar{x})$ has cancelled out.

These results mean that we can calculate an approximate free energy
$F'$ in a classical manner (i.e., using an expression like Eq. (10.48)) and
get a good approximate result if we use an effective potential $U(\bar{x})$, as
defined by Eq. (10.68), in place of $V(\bar{x})$. Incidentally, we note that the
effective potential is temperature-dependent.

The effective potential is a mean value of $V(x)$ averaged over points
near $\bar{x}$ in a gaussian fashion where the root-mean-square spread (or
standard deviation) of the gaussian weighting function is $(\beta\hbar^2/12m)^{1/2}$.
Furthermore, if we follow through the various inequalities which are
involved in our approximation, we find that the approximate free energy
$F'$ exceeds the true free energy $F$. The details of this are discussed in
the next chapter, at Eq. (11.9) and following.
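The gaussian averaging in Eq. (10.68) is easy to carry out numerically. The sketch below assumes units with $\hbar = m = 1$ and uses a quartic potential $V(x) = x^4$ purely as a test case; the potential, the value of $\beta$, and the function names are illustrative choices. For this potential the gaussian average can also be done by hand, giving $\bar{x}^4 + 6\sigma^2\bar{x}^2 + 3\sigma^4$ with $\sigma^2 = \beta\hbar^2/12m$, while the first-order form $V + (\beta\hbar^2/24m)V''$ keeps only the first two terms.

```python
import math

def effective_potential(potential, x_bar, beta, m=1.0, hbar=1.0, steps=4000):
    """Eq. (10.68) evaluated with a midpoint sum over y."""
    sigma = math.sqrt(beta * hbar ** 2 / (12.0 * m))   # rms spread of the weight
    width = 10.0 * sigma                               # tails beyond this are negligible
    dy = 2.0 * width / steps
    norm = math.sqrt(6.0 * m / (math.pi * beta * hbar ** 2))
    total = 0.0
    for i in range(steps):
        y = -width + (i + 0.5) * dy
        weight = norm * math.exp(-6.0 * m * y * y / (beta * hbar ** 2))
        total += weight * potential(x_bar + y) * dy
    return total

def quartic(x):
    """Illustrative test potential V(x) = x**4."""
    return x ** 4

def quartic_second_derivative(x):
    return 12.0 * x * x

beta = 1.2      # chosen so that sigma**2 = beta / 12 = 0.1 in these units
x_bar = 1.0
u_smeared = effective_potential(quartic, x_bar, beta)
u_first_order = quartic(x_bar) + beta / 24.0 * quartic_second_derivative(x_bar)

if __name__ == "__main__":
    print("gaussian average, Eq. (10.68):", u_smeared)
    print("first-order corrected potential:", u_first_order)
```

The two values differ by the $3\sigma^4$ term, which is of higher order in $\hbar$ than the correction kept in Eq. (10.57).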


Problem 10-6 Show that the relation of Eq. (10.68) becomes the 
“corrected” potential of Eq. (10.57) (that is, the argument of the expo- 
nent in that equation) if V is expanded as a Taylor series. 


Problem 10-7 Test the validity of the approximation as it applies
to the harmonic oscillator, for which the exact value of the free energy
is

$$F_{\text{exact}} = kT\ln\left[2\sinh\frac{\hbar\omega}{2kT}\right] \tag{10.69}$$

Evaluate the approximate value for the free energy by means of the
effective potential $U$. Show that

$$U(\bar{x}) = \frac{m\omega^2}{2}\left(\bar{x}^2 + \frac{\beta\hbar^2}{12m}\right) \tag{10.70}$$

and that

$$F_{\text{approx}} = kT\left[\ln\frac{\hbar\omega}{kT} + \frac{1}{24}\left(\frac{\hbar\omega}{kT}\right)^2\right] \tag{10.71}$$

Determine the free energy or, better, the ratio of the free energy to $kT$,
for various values of the frequency. It is suggested that the values of 1.0,
2.0, and 4.0 be used for the ratio $\hbar\omega/kT$. Show that $F'$ is greater than
$F$, as expected, and that the errors grow as the temperature falls. Note
that if we are even very far from the classical region (e.g., where the
ratio $\hbar\omega/kT = 2.0$, so that the system has an 85 per cent probability of
being in the ground state) the approximate results are still surprisingly
close to the true results.
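The comparison suggested in this problem takes only a few lines to tabulate. The sketch below evaluates $F/kT$ from Eq. (10.69), from Eq. (10.71), and from the purely classical expression $\ln(\hbar\omega/kT)$, for the three suggested values of $\hbar\omega/kT$. The function names are illustrative, and the printed layout is a convenience of this sketch rather than a reproduction of any table in the text.

```python
import math

def exact_free_energy(x):
    """F/kT from Eq. (10.69), with x = hbar*omega/kT."""
    return math.log(2.0 * math.sinh(0.5 * x))

def effective_potential_free_energy(x):
    """F'/kT from Eq. (10.71)."""
    return math.log(x) + x * x / 24.0

def classical_free_energy(x):
    """F/kT in the classical approximation, ln(hbar*omega/kT)."""
    return math.log(x)

if __name__ == "__main__":
    print("hw/kT    exact    effective  classical")
    for x in (1.0, 2.0, 4.0):
        print("%5.1f  %8.4f  %9.4f  %9.4f" % (
            x,
            exact_free_energy(x),
            effective_potential_free_energy(x),
            classical_free_energy(x),
        ))
```

The effective-potential value lies above the exact one in each row, and the classical value lies below it.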

Compare these results with those obtained through the classical
approximation in which the free energy is given by $kT\ln(\hbar\omega/kT)$. Your
results should show the values given in the accompanying table.


10-4 SYSTEMS OF SEVERAL VARIABLES


If a system has several variables, the formulas describing them are ob- 
tained by direct extension of the methods we have already studied, ex- 
cept for some special problems which arise from consideration of sym- 
metry properties. 


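Liquid Helium. As an example consider the problem of finding
the partition function for liquid helium. Suppose we have $N$ identical
atoms, each of mass $m$, confined in some volume. Suppose further that
atoms interact in pairs through a potential $V(r_{12})$. This potential is a
weak attraction at large distances and a very strong repulsion at short
distances. Just to orient our thinking, we might imagine $V(r)$ as the
potential for hard spheres. That is

$$V(r) = \begin{cases} 0 & r > a \\ \infty & r < a \end{cases} \tag{10.72}$$

The lagrangian for such a system has the form

$$L = \frac{m}{2}\sum_i |\dot{\mathbf{x}}_i|^2 - \frac{1}{2}\sum_{i\neq j}V(r_{ij}) \tag{10.73}$$

which means that the partition function is

$$Z = \int\left[\int_{\mathbf{x}_i(0)}^{\mathbf{x}_i(0)}\exp\left\{-\frac{1}{\hbar}\int_0^{\beta\hbar}\left[\frac{m}{2}\sum_i|\dot{\mathbf{x}}_i(u)|^2 + \frac{1}{2}\sum_{i\neq j}V(|\mathbf{x}_i(u)-\mathbf{x}_j(u)|)\right]du\right\}\mathcal{D}^{3N}\mathbf{x}(u)\right]d^{3N}\mathbf{x}(0) \tag{10.74}$$

Here the symbol $\mathcal{D}^{3N}\mathbf{x}(u)$ stands for $\mathcal{D}^3\mathbf{x}_1(u)\,\mathcal{D}^3\mathbf{x}_2(u)\cdots\mathcal{D}^3\mathbf{x}_N(u)$ and
similarly $d^{3N}\mathbf{x}(0)$ means $d^3\mathbf{x}_1(0)\,d^3\mathbf{x}_2(0)\cdots d^3\mathbf{x}_N(0)$. The path integral
is performed over paths taken between initial points $\mathbf{x}_i(0)$ and final
points $\mathbf{x}_i(\beta\hbar)$ such that $\mathbf{x}_i(\beta\hbar) = \mathbf{x}_i(0)$.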

The form which we have written down in Eq. (10.74) is actually not
correct. The symmetry properties which we mentioned above will affect
this result. This characteristic is one of the interesting features of the
quantum mechanics of identical particles. In Sec. 1-3 we mentioned that
if an event occurs in two indistinguishable ways, then the amplitudes 
for the two ways will add. In particular, when we are dealing with 
indistinguishable particles, one alternative way for accomplishing any 
event always exists; namely, the interchange of two particles. In such 
a case the amplitudes for the particles (1) as interchanged and (2) as 
not interchanged must be added. (This addition rule applies to Bose 
particles. For Fermi particles the contributions for amplitudes which 
arise from odd permutations of particles will subtract from each other.) 
Ordinary helium atoms are of isotopic mass 4 and contain 6 particles: 
2 protons, 2 neutrons, and 2 electrons. This means that helium atoms 
are Bose particles and the amplitudes for interchange of particles add. 
(For instance, we say that Bose particles follow symmetrical statistics, 
whereas Fermi particles follow antisymmetrical statistics. ) 

To see how this addition of amplitudes comes about, at least for 
helium atoms, we can follow this line of argument: In the final state 
the atoms cannot be distinguished from each other. Thus, although the 
appearance of the configuration of atoms may be the same finally as it 
was initially, the identity of some of the atoms may have been exchanged. 

For example, an atom which we shall designate as 1 starts at position
$x_1(0)$. We have assumed that some atom at least will be in this same
position at the close. Thus, for some atom $x_i(\beta\hbar)$ is equal to $x_1(0)$.
However, it may not be atom 1 which ends up in this particular place.
Instead, atom 1 may go to the initial position of atom 2, say $x_2(0)$, while
at the same time atom 2 has moved into the initial position of atom 1.
That is, it is possible that atoms 1 and 2 exchange places in the final
configuration.

To describe this situation in the most general terms, let $Px_i$ stand
for some permutation among the atoms which are initially at $x_i$. Thus,
for example, in the situation in which atoms 1 and 2 were exchanged
and all others remained where they were, we would have

$$Px_1 = x_2 \qquad Px_2 = x_1 \qquad Px_3 = x_3 \qquad \cdots \qquad Px_N = x_N \tag{10.75}$$

In general, the final state can be any permutation of the initial state:

$$x_i(\beta\hbar) = Px_i(0) \tag{10.76}$$


Thus in order to construct the complete amplitude, we must sum
amplitudes over all the $N!$ possible permutations, since each permutation
represents an alternative possibility. The normalization is correct if we
average over all the permutations. The resulting rules for symmetrical
statistics mean that Eq. (10.74) must be replaced by

$$Z = \frac{1}{N!}\sum_P\int\left[\int_{\mathbf{x}_i(0)}^{P\mathbf{x}_i(0)}\exp\left\{-\frac{1}{\hbar}\int_0^{\beta\hbar}\left[\frac{m}{2}\sum_i|\dot{\mathbf{x}}_i(u)|^2 + \frac{1}{2}\sum_{i\neq j}V(|\mathbf{x}_i(u)-\mathbf{x}_j(u)|)\right]du\right\}\mathcal{D}^{3N}\mathbf{x}(u)\right]d^{3N}\mathbf{x}(0) \tag{10.77}$$

where $\sum_P$ means a sum over all permutations $P$.


If we were dealing with Fermi particles, e.g., the isotope of helium
which has three nucleons, we would have to include an extra factor of
±1, positive for even permutations and negative for odd permutations.
There would also be some extra features which depend upon the spin of
the atom in our result.

It is possible to give a more detailed derivation of Eq. (10.77) in the
following manner. For helium-4 atoms the quantum-mechanical
amplitude for two atoms which start at positions $a$ and $b$ to get to positions
$c$ and $d$ is


K(c,a;d,b) + K(d,a;c,b) (10.78) 


(Amplitudes for alternative final conditions add, since these conditions 
cannot be distinguished from each other.) In this expression K (c, a; d, b) 
is the complex amplitude for one particle to go from a to c and for one 
particle to go from b to d. 

Since the particles are indistinguishable, their symmetry properties
imply that the amplitude to find the two particles finally at the points
$c$ and $d$ must be a symmetric function of $c$ and $d$. That is, the wave
function $\psi(c,d)$ must be a symmetric function of the variables $x_c$, $x_d$.
That is

$$\psi(c,d) = \psi(d,c) \tag{10.79}$$

If the particles were Fermi, the wave function would have to be an
antisymmetric function of these positions.
If many particles are involved, the rule is simply extended, that is,

$$\phi(1,2,3,\ldots,N) = \phi(1,3,2,\ldots,N) = \phi(1,2,4,\ldots,N) = \text{etc.} \tag{10.80}$$


The simplest statement of the general rule is that the wave function
must be symmetric (antisymmetric for Fermi particles). Although other
solutions of Schrödinger’s wave equation exist, only symmetric and
antisymmetric ones appear in nature. Hence in the sum defining the
partition function in Eq. (10.2), we do not wish the sum over all energy
eigenstates of the hamiltonian $H$ which can be obtained from solution
of $H\phi_n = E_n\phi_n$, but only over those for which the wave function $\phi_n$ is
a symmetric function. For example, the density matrix $\rho(x',x)$ is
defined by Eq. (10.28) with a disregard for the statistics of the $N$ atoms
involved. How can we reduce this sum to include only symmetric wave
functions?

To accomplish this reduction, we use the following trick. First we
notice that for any function a symmetric function can be produced simply
by permuting all variables and adding together the resulting functions.
Thus for any function $f(x_1,x_2)$ the combination $f(x_1,x_2) + f(x_2,x_1)$
is a symmetric function. It follows that for any wave function
$\phi(x_1,x_2,\ldots,x_N)$ the function

$$\phi'(x_i) = \sum_P \phi(Px_i) \tag{10.81}$$


is symmetrical. Now if $\phi_n(x_i)$ is a solution to the Schrödinger equation,
then $\phi_n'(x_i)$ as defined by Eq. (10.81) is also a solution, since the
hamiltonian $H$ is symmetric for an interchange of coordinates. Therefore,
each interchanged form $\phi_n(Px)$ is a solution, as is the sum.

Some of the energy eigenvalues $E_n$ have eigenfunctions $\phi_n$ which are
symmetric, and some do not. Suppose $E_k$ is an energy eigenvalue for
which the Schrödinger equation does not have a symmetric solution.
Then the sum $\sum_P\phi_k(Px)$ must vanish, since if it existed it would be
a symmetric solution for $E_k$. This result implies that the operation
defined by Eq. (10.81) selects just those solutions to the wave equation
which are symmetric. All other solutions vanish. If $\phi_n(x)$ is symmetric,
then it is equal to $\phi_n(Px)$; and since there are $N!$ ways of permuting
the $N$ atoms, we have

$$\sum_P\phi_n(Px) = \begin{cases} N!\,\phi_n(x) & \text{if } \phi_n \text{ is symmetric} \\ 0 & \text{if } \phi_n \text{ is of any other symmetry} \end{cases} \tag{10.82}$$
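The way the sum over permutations in Eq. (10.81) picks out the symmetric part can be seen on ordinary functions of three variables. In the sketch below the two test functions are invented for illustration and are not eigenfunctions of any hamiltonian; one is symmetric under every interchange of its arguments and the other is totally antisymmetric. Summing over all $3! = 6$ permutations multiplies the first by $N!$ and annihilates the second, which mirrors the two cases of Eq. (10.82).

```python
from itertools import permutations
from math import factorial

def symmetrize(func, coordinates):
    """Sum func over all permutations of its arguments, as in Eq. (10.81)."""
    return sum(func(*perm) for perm in permutations(coordinates))

def symmetric_example(x1, x2, x3):
    """Unchanged by any interchange of arguments (illustrative only)."""
    return x1 * x2 * x3 + x1 + x2 + x3

def antisymmetric_example(x1, x2, x3):
    """Changes sign under any interchange of two arguments (illustrative only)."""
    return (x1 - x2) * (x2 - x3) * (x3 - x1)

point = (0.3, -1.1, 2.0)                  # arbitrary sample coordinates
n_factorial = factorial(len(point))
symmetric_sum = symmetrize(symmetric_example, point)
antisymmetric_sum = symmetrize(antisymmetric_example, point)

if __name__ == "__main__":
    print("symmetric:    ", symmetric_sum, "vs N! f =", n_factorial * symmetric_example(*point))
    print("antisymmetric:", antisymmetric_sum)
```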


These results give us an answer to our question. We can now select
out of the sum defining the density matrix those particular elements
which apply to symmetric states. Thus

$$\sum_P\rho(Px',x) = \sum_P\sum_n^{\text{all}}\phi_n(Px')\,\phi_n^*(x)\,e^{-\beta E_n} = N!\sum_n^{\text{sym}}\phi_n(x')\,\phi_n^*(x)\,e^{-\beta E_n} = N!\,\rho_{\text{sym}}(x',x) \tag{10.83}$$

This is the reason why in Eq. (10.77) defining the partition function for
symmetric statistics we permute all the particles and divide by $N!$. The
resulting partition function corresponds to

$$\int\rho_{\text{sym}}(x_0,x_0)\,d^{3N}x_0 = Z_{\text{sym}} = \sum_n^{\text{sym}}e^{-\beta E_n} \tag{10.84}$$


We note some of the features of Eq. (10.77). At high temperatures,
we should expect a classical solution for the partition function with
no quantum-mechanical effects in evidence. Suppose we disregard the
effects of the potential for the moment and consider the effect of the
motion of an atom from its initial point to some other point a distance
$d$ away. In the path integral of Eq. (10.77), this is a motion from the
initial point $\mathbf{x}_i(0)$ to the permuted position $P\mathbf{x}_i(0)$, and the contribution
of that particular permutation to the sum over all permutations is
proportional to $\exp\{-mkTd^2/2\hbar^2\}$, thus decreasing with increasing
temperature or increasing spacing between atoms. Hence, unless the atoms
are extremely close together, no permutation in the sum is important,
not even the simplest interchange between two atoms, in comparison
with the identity permutation which leaves all atoms in their original
locations. If we include the effects of the potential which increases steeply
at a radius of 2.7 Å from the center of an atom in liquid helium, then
no configurations in which the atomic spacing is less than this value are
important.
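To get a feeling for the size of the factor $\exp\{-mkTd^2/2\hbar^2\}$, it can be evaluated for helium-4 at a few temperatures. The sketch below assumes a mass of 4 atomic mass units and, purely for illustration, takes the distance $d$ equal to the 2.7 Å radius quoted above; the temperatures and the function name are also choices made here.

```python
import math

HBAR = 1.054571817e-34      # J s
K_BOLTZMANN = 1.380649e-23  # J / K
AMU = 1.66053907e-27        # kg

def exchange_factor(temperature_kelvin, distance_m, mass_amu=4.0):
    """exp(-m k T d^2 / (2 hbar^2)) for one atom moved a distance d."""
    mass = mass_amu * AMU
    exponent = mass * K_BOLTZMANN * temperature_kelvin * distance_m ** 2
    return math.exp(-exponent / (2.0 * HBAR ** 2))

if __name__ == "__main__":
    d = 2.7e-10                             # assumed displacement, meters
    for temperature in (300.0, 4.0, 1.0):
        print(temperature, "K:", exchange_factor(temperature, d))
```

At 300 K the factor is utterly negligible, while at a few kelvin it becomes of order one, which is the regime in which the permutation terms can no longer be ignored.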

Since only the identity permutation makes a significant contribution 
to the summations, all that remains for our consideration is the factor 
1/N!. In the early days of classical statistical mechanics it was realized 
that such a factor was necessary when dealing with identical particles, 
but its significance was not clearly understood. Its effect on the chemical 
constant is called the entropy of mixing when systems of several different
kinds of atoms are studied. 

As the temperature falls, the exponential factor $\exp\{-mkTd^2/2\hbar^2\}$
prejudicing against migrations to new final positions approaches 1, so
that prejudice becomes weaker and weaker. This means that at extremely
low temperatures new terms
become important in the summation over permutations. Of course, the
quantum modifications must be included; and we saw they could be 
included as a first approximation by replacing the potential V with an 
effective potential U. As the temperature falls, the specific heat of liquid 
helium begins to rise slightly near about 2.3 or 2.4 K. 


Problem 10-8 The density of liquid helium is 0.17 g/cm³. Give an
order-of-magnitude estimate of the temperature at which permutation 
terms should begin to play an important role in the description of liquid 
helium. 
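
One rough way to approach such an estimate can be sketched numerically. The sketch below assumes (these are illustrative choices, not the book's worked solution) that the spacing d is the cube root of the volume per atom and that permutations start to matter when the exponent $mkTd^2/2\hbar^2$ is of order 1. The function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34               # reduced Planck constant, J s
K_B = 1.380649e-23                   # Boltzmann constant, J/K
M_HE4 = 4.002602 * 1.66053907e-27    # mass of one helium-4 atom, kg


def mean_spacing(density_kg_m3: float, mass_kg: float = M_HE4) -> float:
    """Assumed spacing d: cube root of the volume available per atom."""
    number_density = density_kg_m3 / mass_kg      # atoms per cubic metre
    return number_density ** (-1.0 / 3.0)


def exchange_factor(temperature_k: float, spacing_m: float,
                    mass_kg: float = M_HE4) -> float:
    """The factor exp{-m k T d^2 / (2 hbar^2)} discussed in the text."""
    exponent = mass_kg * K_B * temperature_k * spacing_m ** 2 / (2.0 * HBAR ** 2)
    return math.exp(-exponent)


def exchange_temperature(density_kg_m3: float, mass_kg: float = M_HE4) -> float:
    """Temperature at which the exponent above equals 1 (assumed criterion)."""
    d = mean_spacing(density_kg_m3, mass_kg)
    return 2.0 * HBAR ** 2 / (mass_kg * K_B * d ** 2)


if __name__ == "__main__":
    rho = 170.0                                   # 0.17 g/cm^3 in kg/m^3
    print("spacing d (angstrom):", mean_spacing(rho) * 1e10)
    print("order-of-magnitude temperature (K):", exchange_temperature(rho))
    print("factor at 300 K:", exchange_factor(300.0, mean_spacing(rho)))
```

With these assumptions the estimate comes out at a few kelvin, and the factor at room temperature is vanishingly small, in line with the qualitative discussion above.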


At first sight, one would not expect very elaborate exchanges of 
atoms to ever be important. An exponential factor involving the spacing 
must be included each time an atom moves to its neighboring location. 
If we call this factor $y$, then for $r$ atoms to move to neighboring spots
the factor $y^r$ must be included, and since $y$ is certainly less than 1 at
any temperature, $y^r$ could become quite small for large $r$. We certainly
would think that as $r$ approaches any reasonable fraction of the
approximately $10^{22}$ atoms in a cubic centimeter of liquid helium, contributions
from factors like $y^r$ must be infinitesimal. However, this first sight does
not take into account the fact that with $r$ atoms permuting, there is
an enormous number of possible permutations, namely $r! \approx e^{r(\ln r - 1)}$.
Thus the small weight of one particular permutation is offset by the
large number involved.
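
The competition between the small weight $y^r$ and the large count $r!$ can be made concrete with logarithms. The sketch below uses the crude count of the paragraph above (all $r!$ orderings, each weighted by $y^r$); it is an illustration of the argument only, not the careful estimate of which cycles actually matter, and the function names are hypothetical.

```python
import math


def log_weight_one_permutation(y: float, r: int) -> float:
    """ln(y^r): weight of one particular rearrangement of r atoms."""
    return r * math.log(y)


def log_number_of_permutations(r: int) -> float:
    """ln(r!) computed exactly (to floating-point accuracy) via lgamma."""
    return math.lgamma(r + 1)


def stirling_estimate(r: int) -> float:
    """The approximation ln(r!) ~ r (ln r - 1) quoted in the text."""
    return r * (math.log(r) - 1.0)


def log_total_weight(y: float, r: int) -> float:
    """ln(r! * y^r): positive when the count outweighs the small factor."""
    return log_weight_one_permutation(y, r) + log_number_of_permutations(r)


if __name__ == "__main__":
    y = 0.1  # illustrative value only
    for r in (2, 10, 100, 1000):
        print(r, log_weight_one_permutation(y, r), log_total_weight(y, r))
```

For any fixed $y < 1$ the combined logarithm eventually turns positive, roughly once $r$ exceeds $e/y$, which is the sense in which the number of permutations offsets the weight of each one.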

Another question which arises in the description of liquid helium con- 
cerns the type of permutations which are involved. Any permutation can 
be described by cycles; thus 1-4, 4-7, 7-6, 6-1 is a cycle. Are the im- 
portant cycles long or short? A careful estimate shows that at moderate 
temperatures, only simple exchanges of two atoms are important. Then 
as the temperature falls, cycles of three atoms become important, then 
four, and so on. But then suddenly, at a certain critical temperature, 
cycles of much greater length L offset by their great number the small 
value of $y^L$. At this temperature cycles of importance become very
long, involving nearly all of the atoms inside a container. At this point 
the curve of specific heat vs. temperature shows a discontinuity. Below 
this temperature the behavior of the liquid is very strange. It flows 
through very thin tubes without resistance for low velocities. It simu- 
lates infinite heat conductivity in bulk, etc. These odd characteristics 
are manifestations of quantum mechanics, particularly the constructive 
interference between amplitudes for replacing one atom with another. 
Quantitatively, the details of the behavior of the specific heat just at 
the transition temperature are not on a very firm foundation. But the
qualitative reason for the transition is clear.¹

The expression analogous to Eq. (10.77) for Fermi particles, such as 
³He, is also easily written down. However, in the case of liquid ³He,
the effect of the potential is very hard to evaluate quantitatively in an 
accurate manner. The reason for this is that the contribution of a cycle 
to the sum over permutations is either positive or negative depending 
on whether the cycle has an odd or even number of atoms in its length 
L. At low temperature, the contributions of cycles such as L = 51 and 
L = 52 are very nearly equal but opposite in sign, and therefore they very 
nearly cancel. It is necessary to compute the difference between such 
terms, and this requires very careful calculation of each term separately. 
It is very difficult to sum an alternating series of large terms which are 
decreasing slowly in magnitude when a precise analytic formula for each 
term is not available. 

Progress could be made in this problem if it were possible to arrange 
the mathematics describing a Fermi system in a way that corresponds 
to a sum of positive terms. Some such schemes have been tried, but the 
resulting terms appear to be much too hard to evaluate even qualita- 
tively. 

For molecules which are separated by distances in the neighborhood 
of 1 Å we have seen that the effects of exchange (the nonidentical
permutations) are important only when the temperature is down to a few
degrees absolute. In contrast to this, consider the behavior of electrons 
in a solid metal. The mass of the electron is so much smaller than 
that of a molecule that the critical temperature is much higher. At 
room temperature, electrons in a metal are described accurately only 
by equations which include the exchange effects of these cyclic permu- 
tations. From this point of view, room temperature is very cold for 
electrons. The exchange effects are of dominant importance, or, to put 
it another way, the electron gas is degenerate. Of course, the electrons 
interact by Coulomb’s law, which is quite strong. But since the effects 
of the Coulomb interaction are of long range, they tend to average out.
To a fair approximation, the electrons act as if they are independent, 
although, of course, each moves in the same periodically varying poten- 
tial produced by the arrangement of the nuclei and the average of the 
positions of neighboring electrons. From the study of the ideal Fermi 
gas neglecting interactions, we can learn a lot about the behavior of 
electrons in metals. 

¹ A more detailed discussion of the partition function of liquid helium from this
point of view may be found in R.P. Feynman, Atomic Theory of the λ Transition in
Helium, Phys. Rev., vol. 91, pp. 1291-1301, 1953.

However, it is apparent that we cannot learn quite enough, for the
superconductivity of metals occurring below a few degrees absolute would
remain a mystery. This phenomenon, in some metals at least, involves 
an interaction in which the slow vibratory motion of the atoms is in- 
volved. We conclude this because the transition temperatures for two 
different isotopes of the same metal depend on the atomic mass. This 
value of the isotopic mass would not be important if the transition were 
simply a matter of mutual interaction between electrons, or interaction 
of the electrons with an idealized array of fixed atoms. The idealization
that the atoms are fixed must be incorrect. But how does the motion of 
the atoms produce a sudden jump in specific heat in some metals and 
permit electrical conductivity without resistance below this tempera- 
ture? This question was first answered in a convincing way by Bardeen, 
Cooper, and Schrieffer.¹ The path integral approach played no part in
their analysis, and in fact it has never proved useful for degenerate Fermi 
systems. 


The Planck Blackbody Radiation Law. The partition function 
for any system of interacting oscillators is easily worked out. Such a 
system is equivalent to a set of independent oscillators of frequencies $\omega_i$.
However, the value of the free energy F for independent systems is the 
sum of the values of F for each of the separate systems, which we find 
directly from the sum of Eq. (10.2) to be 


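$$
kT \ln\left(2\sinh\frac{\hbar\omega}{2kT}\right)
$$

This gives the free energy of a linear system as

$$
F = kT \sum_i \ln\left(2\sinh\frac{\hbar\omega_i}{2kT}\right)
  = kT \sum_i \ln\left(1 - e^{-\hbar\omega_i/kT}\right) + \sum_i \frac{\hbar\omega_i}{2} \tag{10.85}
$$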





The last term in this expression is a ground-state energy of the system. 
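
The closed form for one oscillator, and its split into a thermal part plus the ground-state energy, can be checked against a direct sum over levels. The sketch below works in units where $\hbar = k = 1$ and assumes the usual levels $E_n = \omega(n + \tfrac12)$; the function names and the truncation at a finite number of levels are illustrative choices.

```python
import math


def free_energy_from_levels(omega: float, temperature: float,
                            n_levels: int = 2000) -> float:
    """F = -T ln(sum_n exp(-E_n / T)) with E_n = omega (n + 1/2), truncated."""
    z = sum(math.exp(-omega * (n + 0.5) / temperature) for n in range(n_levels))
    return -temperature * math.log(z)


def free_energy_closed_form(omega: float, temperature: float) -> float:
    """F = T ln(2 sinh(omega / 2T))."""
    return temperature * math.log(2.0 * math.sinh(omega / (2.0 * temperature)))


def free_energy_split(omega: float, temperature: float) -> float:
    """F = T ln(1 - exp(-omega / T)) + omega / 2 (thermal part + zero point)."""
    return temperature * math.log(1.0 - math.exp(-omega / temperature)) + omega / 2.0


def linear_system_free_energy(omegas, temperature: float) -> float:
    """Free energies of independent oscillators simply add."""
    return sum(free_energy_closed_form(w, temperature) for w in omegas)


if __name__ == "__main__":
    for t in (0.2, 1.0, 5.0):
        print(t, free_energy_from_levels(1.0, t),
              free_energy_closed_form(1.0, t), free_energy_split(1.0, t))
```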

For an electromagnetic field in a box of volume V, the modes are 
specified by the vector wave number K, two for each K. The zero-point 
energy is omitted. Thus the free energy of the electromagnetic field per 
unit volume is 


$$
\frac{F}{V} = kT \int 2\ln\left(1 - e^{-\hbar Kc/kT}\right) \frac{d^3K}{(2\pi)^3} \tag{10.86}
$$

¹ J. Bardeen, L.N. Cooper, and J.R. Schrieffer, Theory of Superconductivity, Phys.
Rev., vol. 106, pp. 162-164, 1957 and vol. 108, pp. 1175-1204, 1957.

The internal energy $U$ is the partial derivative of $\beta F$ with respect to $\beta$,
which becomes (putting $\omega = Kc$)


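$$
\frac{U}{V} = 2\int \frac{\hbar\omega}{e^{\hbar\omega/kT} - 1}\, \frac{d^3K}{(2\pi)^3} \tag{10.87}
$$

The volume element in K space can be written as

$$
d^3K = 4\pi K^2\, dK = 4\pi \frac{\omega^2}{c^3}\, d\omega \tag{10.88}
$$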


This means that the energy density in the electromagnetic field in the 
range of frequencies from $\omega$ to $\omega + d\omega$ is

$$
\frac{2 \cdot 4\pi}{(2\pi c)^3}\, \frac{\hbar\omega^3}{e^{\hbar\omega/kT} - 1}\, d\omega \tag{10.89}
$$

This is the famous blackbody-radiation law discovered by Planck. It was
the first real quantitative quantum-mechanical result discovered and was
the first step in the discovery of the new laws.

Another early quantum-mechanical triumph was the explanation of
the temperature dependence of the specific heat of solids by Einstein
and by Debye. This also comes from Eq. (10.85), but the oscillators
are now the normal modes of the crystal, as described in Chap. 8. For
example, the thermal energy per unit volume of such a crystal is, like
Eq. (10.87) (leaving out the zero-point energy), just

$$
\frac{U}{V} = \sum_{3p\ \text{modes}} \int \frac{\hbar\omega(\mathbf{K})}{e^{\hbar\omega(\mathbf{K})/kT} - 1}\, \frac{d^3K}{(2\pi)^3} \tag{10.90}
$$


where $\omega(\mathbf{K})$ is the frequency of a phonon of wave vector $\mathbf{K}$. In a crystal,
this is a multiple-valued function (there are $3p$ values for each $\mathbf{K}$ if there
are $p$ atoms in a unit cell), and we must sum over each of the possible
values of $\omega$ for each $\mathbf{K}$. The $\mathbf{K}$ integral extends only over the finite range
proper for the crystal. For light there are two modes for each $\mathbf{K}$, each of
frequency $\omega = Kc$, so the sum gives a factor 2 and Eq. (10.87) results,
the integral on $\mathbf{K}$ now going to infinity.
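
As a numerical illustration, the Planck spectrum of Eq. (10.89) can be integrated over all frequencies and compared with the standard closed form $\pi^2 (kT)^4 / 15\hbar^3 c^3$ for the total energy density. The sketch below is a plain Simpson's-rule integration; the cutoff at $\hbar\omega = 60\,kT$, the step count, and the function names are assumptions made for the example.

```python
import math

HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J/K
C = 2.99792458e8         # m/s


def spectral_energy_density(omega: float, temperature: float) -> float:
    """Energy density per unit angular frequency, Eq. (10.89) without d(omega).

    The prefactor 2*4*pi/(2*pi*c)^3 simplifies to 1/(pi^2 c^3).
    """
    if omega == 0.0:
        return 0.0
    x = HBAR * omega / (K_B * temperature)
    return HBAR * omega ** 3 / (math.pi ** 2 * C ** 3) / math.expm1(x)


def total_energy_density(temperature: float, x_max: float = 60.0,
                         n: int = 6000) -> float:
    """Simpson's rule over omega from 0 to x_max * kT / hbar (n must be even)."""
    w_max = x_max * K_B * temperature / HBAR
    h = w_max / n
    total = (spectral_energy_density(0.0, temperature)
             + spectral_energy_density(w_max, temperature))
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight * spectral_energy_density(i * h, temperature)
    return total * h / 3.0


def closed_form_energy_density(temperature: float) -> float:
    """pi^2 (kT)^4 / (15 hbar^3 c^3)."""
    return math.pi ** 2 * (K_B * temperature) ** 4 / (15.0 * HBAR ** 3 * C ** 3)


if __name__ == "__main__":
    t = 300.0
    print(total_energy_density(t), closed_form_energy_density(t))
```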

The result of Eq. (10.90), studied in various approximations by Ein- 
stein and Debye, gave a good accounting of the main features of the 
specific-heat curve, particularly the behavior at low temperatures, which 
had been in direct contradiction to the classical expectations. Today, 
putting a more complete knowledge of the phonon spectrum $\omega(\mathbf{K})$ into
Eq. (10.90) yields a completely satisfactory description of that part of
the specific heat of solids due to internal vibration of the atoms.


REMARKS ON METHODS OF DERIVATION


The presentation of statistical mechanics given in the early part of this 
chapter leaves much to be desired. The fundamental law which shows 
that the probability for finding a system in an energy state with energy
$E$ is proportional to $e^{-E/kT}$ is usually derived by considering the
interaction of complex systems over long periods of time. But an entertaining
problem presents itself. 

We started our discussion of physics in this book by expressing the 
laws of quantum mechanics in terms of path integrals (Chap. 2). Just 
as a question of curiosity, let us take the point of view that this is 
the fundamental law. Then ultimately these statistical properties of a 
system whose quantum-mechanical properties are defined by such a path 
integral are found to be expressible in terms of the partition function 
Z. This function can be defined by a path integral of an obviously 
very similar and closely related form, as shown by Eq. (10.77). Yet the 
derivation of this result requires noting the wave equation, the existence 
of stationary states and eigenvalues, and the argument about interaction 
over long periods of time to which we referred, all of which leads to 
the expression (10.2) for the partition function in terms of the energy 
levels $E_n$. Finally, we proceed to the reverse argument producing the
path integral formulation for Z. Is there any way to derive the path 
integral expression for Z for a system in equilibrium directly from the 
path integral description for the time-dependent motion? Can we find 
a short cut which avoids the mention of energy levels altogether? If it 
is possible, we do not yet know how to do it. 

One might ask: Why try it at all? It is like showing that you can 
swim with your hands tied behind your back. After all, you know there 
are energy levels. The only excuse for trying to avoid their mention 
would be that in so doing a deeper understanding of physical processes 
might result or possibly more powerful methods of statistical mechanics 
might be evolved. At any rate, it would be interesting to solve the 
problem.

It was the promptings of a similar quest, to get the well-known varia- 
tional principle for the lowest energy level directly from the path integral 
formulation (instead of indirectly via the Schrödinger equation), which
resulted in the methods described in Chap. 11. Thus the results of
this apparently academic problem were of some use as well as of some
interest.

Nevertheless, if we prefer, we can suppose our desire for one particular
course in achieving a solution is prompted simply by an academic
interest in the methods of classical physics. Suppose that we have a 
system obeying the principle of least action, with the action defined by 


$$
S = \frac{m}{2}\int \dot{x}(t)^2\,dt - \frac{k}{2}\int x(t)\,x(t+a)\,dt \tag{10.91}
$$

so that the equation of motion is

$$
m\ddot{x}(t) = -\frac{k}{2}\left[x(t+a) + x(t-a)\right] \tag{10.92}
$$


Here we have created the curious situation in which a particle is driven 
by a force depending on the average value of coordinates that were and 
that will be. There are exponentially exploding solutions of Eq. (10.92), 
but let us say that only motions for which x remains finite both in the 
distant past and in the distant future will be allowed. (Incidentally, it 
is likely that solutions which we wish to ignore are excluded anyway if 
the action law is stated as $\delta S = 0$ for all variations of path $\delta x$ subject
to the restraint $\delta x \to 0$ for $t \to \pm\infty$.)

For such a system it is possible to define an expression for energy 
which is conserved; for the equations of motion of the system do not 
depend on time. (No simple hamiltonian gives the equations of motion.) 
Presumably, such a system could possess properties which allow it, for 
example, to be perturbed by molecules of a gas and thus achieve thermal 
equilibrium. We might ask: What are the averages of various quantities 
describing a system obeying the equations of motion of Eq. (10.92) and 
appropriate boundary conditions at infinity when it is in equilibrium at 
the temperature T? Perhaps such a problem is not definable, or perhaps 
it is easily solved only in this special case because the equations of motion 
are linear. But the aim of these remarks is to ask whether the existence 
of a hamiltonian and momentum variables is indeed necessary to the 
formulation of classical statistical mechanics, or whether a wider class
of mechanical systems can be analyzed, a system in which the equation 
of motion comes most simply from the principle of least action, even 
though that action involves more than the instantaneous positions and 
velocities of the particles in the system. 

This question is the classical analogue of our more interesting ques- 
tion, namely, how do we proceed directly from the path integral formu- 
lation of quantum-mechanical laws of a mechanical system to the path 
integral formulation of statistical mechanical laws for the same system 
in equilibrium?


Problem 10-9 Show that the expression

$$
E(t) = \frac{m}{2}\dot{x}^2(t) + \frac{k}{2}\,x(t)\,x(t+a) - \frac{k}{2}\int_t^{t+a} x(t'-a)\,\dot{x}(t')\,dt' \tag{10.93}
$$


defines a conserved energy for the equation of motion (10.92). 
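
Before attempting a proof, the claim can be checked numerically on one bounded solution. The sketch below assumes the trial solution $x(t) = \cos\omega t$, which satisfies Eq. (10.92) when $m\omega^2 = k\cos\omega a$, and evaluates Eq. (10.93) at several times with Simpson's rule. The parameter values and function names are illustrative assumptions, and a numerical check on one solution is of course not a derivation.

```python
import math

M, K, A = 1.0, 1.0, 0.5   # illustrative mass m, coupling k, and delay a


def find_frequency(m: float = M, k: float = K, a: float = A) -> float:
    """Bisection for the root of m w^2 = k cos(w a) with 0 < w < pi / (2a)."""
    lo, hi = 0.0, math.pi / (2.0 * a)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if m * mid * mid - k * math.cos(mid * a) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def energy(t: float, w: float, m: float = M, k: float = K,
           a: float = A, n: int = 400) -> float:
    """Eq. (10.93) evaluated for x(t) = cos(w t); n must be even."""
    def x(s: float) -> float:
        return math.cos(w * s)

    def v(s: float) -> float:
        return -w * math.sin(w * s)

    # Simpson's rule for the integral of x(s - a) * v(s) from s = t to t + a.
    h = a / n
    total = x(t - a) * v(t) + x(t) * v(t + a)
    for i in range(1, n):
        s = t + i * h
        total += (4.0 if i % 2 else 2.0) * x(s - a) * v(s)
    integral = total * h / 3.0

    return 0.5 * m * v(t) ** 2 + 0.5 * k * x(t) * x(t + a) - 0.5 * k * integral


if __name__ == "__main__":
    w = find_frequency()
    for t in (0.0, 0.7, 1.9, 3.3):
        print(t, energy(t, w))
```

The printed values should agree to within the accuracy of the quadrature, whatever the time at which the expression is evaluated.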


In general, for any action functional, like S, that does not involve the 
time explicitly (i.e., is invariant for the transformation t — t + const) 
there is an expression $E(T)$ for the energy at time $T$ which is conserved.
It can be found by asking for the first-order change in the action $S$ when
all paths are changed from $x(t)$ to $x(t + \eta(t))$, where $\eta(t) = +\epsilon/2$ for
$t < T$ and $\eta(t) = -\epsilon/2$ for $t > T$, with constant $\epsilon$. $\delta S$ is then $\epsilon E(T)$ for
infinitesimal $\epsilon$.


Problem 10-10 Discuss the problem of the path integral formulation
of statistical mechanics for a particle in a time-constant magnetic
field.


The Variational Method


In this chapter we discuss a method based on a variational principle
for the approximate evaluation of certain path integrals. First, we shall 
illustrate the method by some examples. Later, we consider those prob- 
lems for which the method may be useful. 


A MINIMUM PRINCIPLE 


Suppose we wish to evaluate the free energy F of a system. This problem 
can be expressed in terms of path integrals by starting with the partition 
function for the system defined in Eq. (10.4) as 


$$
Z = e^{-\beta F} \tag{11.1}
$$


In Eq. (10.30) the partition function was expressed as an integral of 
the density matrix $\rho(x, x)$. Then, in Sec. 10-2, a kernel expression for
$\rho(x', x)$ was developed. It allows us to write

$$
Z = \int \int_{x_a}^{x_a} e^{S/\hbar}\, \mathcal{D}x(u)\, dx_a \tag{11.2}
$$


so long as we use the “time” variable u in the way described below 
Eq. (10.43). (Also, S here is the negative of the functional S used in 
Chap. 10.) 

In Sec. 10-3 we developed a perturbation technique for the evaluation 
of the path integral defining the partition function for certain special 
cases. We shall now describe another technique, applicable in those 
cases where S is real. For ordinary cases without a magnetic field (and 
no spin) S is real. 

Throughout the remainder of this chapter, we choose units in which 
the value of $\hbar$ is 1. Whenever it is necessary to include $\hbar$ symbolically in
order to visualize the quantum-mechanical character of a result, it can 
be so included by a straightforward dimensional inspection. 

Let us suppose that some other $S'$ can be found which satisfies
two conditions. First, $S'$ is simple enough that expressions such as
$\int e^{S'}\,\mathcal{D}x(u)$ or $\int G e^{S'}\,\mathcal{D}x(u)$, for simple functionals $G$, can be
evaluated. Second, the important paths in the integral $\int e^{S}\,\mathcal{D}x(u)$ and those
in the integral $\int e^{S'}\,\mathcal{D}x(u)$ are similar, that is, $S'$ and $S$ are similar when
they are both large. Now suppose $F'$ is the free energy associated with
$S'$. That is,

$$
e^{-\beta F'} = \int \int_{x_a}^{x_a} e^{S'}\, \mathcal{D}x(u)\, dx_a \tag{11.3}
$$


so that 


$$
\frac{\int\int_{x_a}^{x_a} e^{S}\,\mathcal{D}x(u)\,dx_a}{\int\int_{x_a}^{x_a} e^{S'}\,\mathcal{D}x(u)\,dx_a} = e^{-\beta(F - F')} \tag{11.4}
$$

Then since $e^{S} = e^{S-S'} e^{S'}$ we can write Eq. (11.4) as

$$
\frac{\int\int_{x_a}^{x_a} e^{S-S'} e^{S'}\,\mathcal{D}x(u)\,dx_a}{\int\int_{x_a}^{x_a} e^{S'}\,\mathcal{D}x(u)\,dx_a} = e^{-\beta(F - F')} \tag{11.5}
$$

This says simply that $e^{-\beta(F-F')}$ is the average value of $e^{S-S'}$, where this
average is taken over all paths with the same initial and final point and
the weight of each path is $e^{S'}$. All possible values of $x_a$ are included in
the averaging process.

One way to proceed now would be to suppose that $S - S'$ is small and
that $F - F'$ is small and then expand both sides up to the first power
in their respective exponents. This method appears dubious because
$\beta(F - F')$ is not small if $\beta$ is large. However, comparison of higher-
power terms shows that this is nevertheless a legitimate approximation
to $F' - F$.

The argument can be made much more rigorous and powerful in
the following way. The average value of $e^x$ when $x$ is a random variable
always exceeds or equals the exponential of the average value of $x$, as long
as $x$ is real and the weights used in the averaging process are positive.
That is,

$$
\langle e^x \rangle \ge e^{\langle x \rangle} \tag{11.6}
$$


where $\langle x \rangle$ means the weighted average of $x$. This follows because the
curve of $e^x$ is concave upward, as shown in Fig. 11-1, so that if a number
of masses (weights) lie along this curve, the center of gravity of
these masses, the point with coordinates $(\langle x \rangle, \langle e^x \rangle)$, lies above the
curve. The vertical height of this center of gravity is the average vertical
position $\langle e^x \rangle$ of the points. It exceeds $e^{\langle x \rangle}$, the ordinate of the curve
$e^x$ at the abscissa position of the center of gravity, which is the average
value $\langle x \rangle$.

Fig. 11-1 We assume the weighting factors $a_i$ are positive and look on them as different masses
positioned along a string. Then the exponential of the weighted average of $x$, that is, $e^{\langle x \rangle}$,
must lie below the weighted average of the exponentials $\langle e^x \rangle$ because of the concave nature of
the curve $e^x$. The value of $e^{\langle x \rangle}$ must lie on the curve, but $\langle e^x \rangle$, the center of gravity of the
several points, must lie under the dot-dashed line and above the curve.
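
The inequality of Eq. (11.6) is easy to test numerically for arbitrary positive weights. The sketch below draws random values and weights (the sample sizes and ranges are arbitrary choices for illustration) and checks that the weighted average of $e^x$ never falls below the exponential of the weighted average of $x$; the function names are hypothetical.

```python
import math
import random


def weighted_mean(values, weights):
    """Weighted average with positive weights (normalised internally)."""
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


def jensen_gap(xs, weights) -> float:
    """<e^x> - e^<x>; Eq. (11.6) says this is never negative."""
    mean_of_exp = weighted_mean([math.exp(x) for x in xs], weights)
    exp_of_mean = math.exp(weighted_mean(xs, weights))
    return mean_of_exp - exp_of_mean


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        xs = [rng.uniform(-3.0, 3.0) for _ in range(10)]
        ws = [rng.uniform(0.1, 2.0) for _ in range(10)]
        print(jensen_gap(xs, ws))
```

The gap vanishes only when all the $x$ values carrying weight coincide, which corresponds to all the masses in Fig. 11-1 sitting at a single point of the curve.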

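On the left-hand side of Eq. (11.5) we take the average value of $e^{S-S'}$
over paths with the positive weight $e^{S'}$, where $S'$ and $S$ are real. Hence,
by Eq. (11.6), this exceeds the quantity $e^{\langle S-S' \rangle}$, where $\langle S - S' \rangle$ is the
average of $S - S'$ with this same weighting scheme, namely, with the
weight $e^{S'}$. That is,

$$
\langle S - S' \rangle = \frac{\int\int_{x_a}^{x_a} (S - S')\, e^{S'}\,\mathcal{D}x(u)\,dx_a}{\int\int_{x_a}^{x_a} e^{S'}\,\mathcal{D}x(u)\,dx_a} \tag{11.7}
$$

We have then

$$
e^{-\beta(F - F')} \ge e^{\langle S - S' \rangle} \tag{11.8}
$$

This result implies that

$$
F \le F' - \frac{1}{\beta}\langle S - S' \rangle \tag{11.9}
$$

Our final result is then

$$
F \le F' - \delta \tag{11.10}
$$

where

$$
\delta = \frac{1}{\beta}\, \frac{\int\int_{x_a}^{x_a} (S - S')\, e^{S'}\,\mathcal{D}x(u)\,dx_a}{\int\int_{x_a}^{x_a} e^{S'}\,\mathcal{D}x(u)\,dx_a} \tag{11.11}
$$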


It is very fortunate that we have a minimum principle here. It says
that, if we calculate $F' - \delta$ for various "actions" $S'$, that calculation
which gives the smallest result is nearest to the true free energy $F$.¹
The energy $F$ is actually obtained, of course, if $S' = S$; but we can
guess that if $S$ and $S'$ differ in some sense to a first order of smallness,
then the deviation of $F' - \delta$ from $F$ must be of second order.

If only a reasonable general form of $S'$ can be guessed but certain
parameters still remain uncertain, the calculation of $F' - \delta$ can be made
leaving these parameters undetermined. Then the nearest approximation
to $F$ will be the lowest $F' - \delta$ available. That is, the "best" values
of the parameters are those which minimize $F' - \delta$, "best" in the sense
that the resultant $F' - \delta$ differs least from the true $F$.


¹ It is worth emphasizing again that neither $S$ nor $S'$ is an action functional in
the proper physical meaning of the term, since both are defined with the variable $u$
used as the "time" variable. However, operations with path integrals are the same
for these functionals as for proper physical actions used previously.

This same minimum principle can be used to find an approximate
value for the lowest energy state of the system, $E_0$. Recall that

$$
e^{-\beta F} = \sum_n e^{-\beta E_n} \tag{11.12}
$$

As the temperature of the system becomes lower and lower, that is, as $\beta$
grows larger and larger, terms involving higher values of energy become
less and less important in this series. Eventually the series for $Z$ is
dominated by the term of smallest energy, $e^{-\beta E_0}$. That is,

$$
Z \to e^{-\beta E_0} \quad \text{as } \beta \to \infty \tag{11.13}
$$

Now following the line of argument developed in the preceding paragraphs,
we can simply replace $F$ with $E_0$. We define $E_0'$ as the result of
the path integral involving the new action $S'$ and finally derive

$$
E_0 \le E_0' - \delta \tag{11.14}
$$

as an approximation in the limit of large $\beta$.

In approximating $E_0$ by this technique, our task is somewhat simpler
than it was for the free energy $F$. Specifically, we can disregard the
specification that the initial and final points of the paths be the same.
To understand this, we refer back to Eq. (10.28) and note that as $\beta$
becomes large the density matrix $\rho(x', x)$ is also dominated by the
zero-order term and approaches $\phi_0(x')\,\phi_0^*(x)\,e^{-\beta E_0}$. Thus the dependence on
$x'$ and $x$ enters into a multiplying factor but does not affect the nature of
the exponential behavior of the function. It is this exponential behavior
which is fundamental in the evaluation of $E_0$ by this technique.
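
The way the free energy approaches the lowest level as $\beta$ grows, Eqs. (11.12) and (11.13), can be seen directly from a list of energy levels. The sketch below uses harmonic-oscillator levels $n + \tfrac12$ in units with $\hbar = \omega = 1$ purely as an illustrative spectrum; the function name is hypothetical.

```python
import math


def free_energy(levels, beta: float) -> float:
    """F = -(1/beta) ln(sum_n exp(-beta E_n)).

    The factor exp(-beta E_0) is pulled out of the sum so that large beta
    does not cause floating-point underflow.
    """
    e0 = min(levels)
    scaled_sum = sum(math.exp(-beta * (e - e0)) for e in levels)
    return e0 - math.log(scaled_sum) / beta


if __name__ == "__main__":
    levels = [n + 0.5 for n in range(200)]
    for beta in (1.0, 5.0, 20.0, 50.0):
        print(beta, free_energy(levels, beta))
```

The printed values rise toward the lowest level from below, since every extra term in the sum can only lower $F$.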


AN APPLICATION OF THE VARIATIONAL METHOD 


As an example of the evaluation of a partition function using this varia- 
tional principle, consider the example of a single particle constrained to 
move in one dimension. Using the approach developed in Chap. 10, we 
write the action for such a particle as 


$$
S = -\int_0^\beta \left[\frac{m}{2}\dot{x}^2(u) + V(x(u))\right] du \tag{11.15}
$$

So the partition function is

$$
Z = e^{-\beta F} = \int\int_{x_a}^{x_a} \exp\left\{-\int_0^\beta \left[\frac{m}{2}\dot{x}^2(u) + V(x(u))\right] du\right\} \mathcal{D}x(u)\,dx_a \tag{11.16}
$$

This path integral is over paths which return to the initial starting
points; and after the path integral has been evaluated, a further
integration over all possible starting points is carried out. 

In Sec. 10-2 we considered this same problem and pointed out how the 
classical approximation may be derived by inspection. In the classical 
limit of high temperatures, or high values of $kT$ compared with $\hbar$, the
value of $\beta\hbar$ is so small that paths which get very far away from $x_a$ do not
contribute. Thus, the potential can be replaced by the constant value
$V(x_a)$, and the path integral contributes only a constant, giving

$$
Z_{\text{classical}} = e^{-\beta F_{\text{classical}}} = \sqrt{\frac{m}{2\pi\beta}} \int_{-\infty}^{\infty} e^{-\beta V(x)}\,dx \tag{11.17}
$$


as shown in Eq. (10.48). 
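
To see how good the classical formula is, it can be evaluated numerically for a case where the exact quantum answer is known. The sketch below takes the harmonic potential $V = \tfrac12 m\omega^2 x^2$ with $\hbar = 1$, for which Eq. (11.17) gives $1/\beta\omega$ and the exact partition function is $1/(2\sinh(\beta\omega/2))$. The choice of potential, the trapezoid rule, the integration range, and the function names are all assumptions for the illustration.

```python
import math


def classical_partition_function(potential, beta: float, mass: float,
                                 x_max: float, n: int = 4000) -> float:
    """Eq. (11.17) with hbar = 1, by the trapezoid rule on [-x_max, x_max]."""
    h = 2.0 * x_max / n
    total = 0.5 * (math.exp(-beta * potential(-x_max))
                   + math.exp(-beta * potential(x_max)))
    for i in range(1, n):
        total += math.exp(-beta * potential(-x_max + i * h))
    return math.sqrt(mass / (2.0 * math.pi * beta)) * total * h


def exact_oscillator_partition_function(beta: float, omega: float) -> float:
    """Quantum result for the harmonic oscillator, hbar = 1."""
    return 1.0 / (2.0 * math.sinh(0.5 * beta * omega))


if __name__ == "__main__":
    mass, omega = 1.0, 1.0

    def harmonic(x: float) -> float:
        return 0.5 * mass * omega ** 2 * x ** 2

    for beta in (0.1, 1.0, 4.0):
        x_max = 10.0 / math.sqrt(beta * mass * omega ** 2)
        z_cl = classical_partition_function(harmonic, beta, mass, x_max)
        print(beta, z_cl, exact_oscillator_partition_function(beta, omega))
```

At small $\beta\omega$ the two values nearly coincide; as $\beta\omega$ grows the classical value overestimates $Z$, which is the discrepancy the effective potential is meant to reduce.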

In Sec. 10-3, one quantum-mechanical improvement was made on the
classical result by expanding the potential about the average position of 
the path and using terms up through the second order in this expansion. 
Then a still greater improvement was achieved by using an effective 
potential U, developed through a particular averaging process. From the 
point of view of this chapter, we see that that approach was a special 
application of the variational method. To clarify this point, we shall 
review the key steps using the notation and concepts of this chapter. 

Thus we wish to derive a suitable trial function $W(\bar{x})$, a substitute
for the potential, where $\bar{x}$ is the average position of a path defined by

$$
\bar{x} = \frac{1}{\beta}\int_0^\beta x(u)\,du \tag{11.18}
$$

Along any particular path, this $W(\bar{x})$ is a constant, so that the new form
of the action along that path becomes

$$
S'[x(u)] = -\frac{m}{2}\int_0^\beta \dot{x}^2(u)\,du - \beta W(\bar{x}) \tag{11.19}
$$

With this more general form, it is possible to calculate both $F'$ and
$\langle S - S' \rangle$.

Proceeding along this course, we use Eq. (11.11). Substituting into 
this expression, we have 


$$
\delta = \frac{\int\int_{x_a}^{x_a} \left[ W(\bar{x}) - \frac{1}{\beta}\int_0^\beta V(x(u'))\,du' \right] e^{S'[x(u)]}\,\mathcal{D}x(u)\,dx_a}{\int\int_{x_a}^{x_a} e^{S'[x(u)]}\,\mathcal{D}x(u)\,dx_a} \tag{11.20}
$$

where

$$
e^{S'[x(u)]} = \exp\left\{-\frac{m}{2}\int_0^\beta \dot{x}^2(u)\,du\right\} \exp\{-\beta W(\bar{x})\}
$$

It is to be remembered that the paths to be used in the path integrals of
Eq. (11.20) are those which have the same initial and final points, and,
as in Eq. (11.16), a further integration over all end points $x_a$ is to be
carried out.

Note that the numerator of $\delta$ is quite similar to the term $I(\bar{x})$ introduced
in Eq. (10.64), if we restrict ourselves to paths that have a specific
average value $\bar{x}$ and count on integrating over all possible values of $\bar{x}$ at
a later stage of the calculation. By the same arguments as were used
in the discussion of $I(\bar{x})$, we see that the numerator of $\delta$ is independent
of $u'$. We can evaluate the path integrals in both numerator and denominator
by the methods used in Chap. 10 and take the answer from
Eq. (10.65), remembering that

$$
Y = x_a - \bar{x} \tag{11.21}
$$


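Since the denominator is simply a special form of the expression appearing
in the numerator, the result is

$$
\delta = \frac{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \left[W(\bar{x}) - V(x_a)\right] \exp\left\{-\frac{6m}{\beta}(x_a - \bar{x})^2\right\} \exp\{-\beta W(\bar{x})\}\, d\bar{x}\, dx_a}{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \exp\left\{-\frac{6m}{\beta}(x_a - \bar{x})^2\right\} \exp\{-\beta W(\bar{x})\}\, d\bar{x}\, dx_a} \tag{11.22}
$$

The integral over $x_a$ in the denominator in Eq. (11.22) can be easily
evaluated to give $(\pi\beta/6m)^{1/2}$. Furthermore, the integral over the
term in the numerator containing the factor $W(\bar{x})$ results in this same
multiplying constant. It will be more convenient for our future work if
we carry out that particular integration in the numerator and further
simplify the resulting expression by defining the function $\bar{V}(\bar{x})$ as

$$
\bar{V}(\bar{x}) = \sqrt{\frac{6m}{\pi\beta}} \int_{-\infty}^{\infty} V(x_a) \exp\left\{-\frac{6m}{\beta}(\bar{x} - x_a)^2\right\} dx_a \tag{11.23}
$$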


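The form of $\bar{V}(\bar{x})$ reveals the quantum-mechanical effect we have
introduced. This function is a weighted average of $V(x_a)$ with a gaussian
weighting function just like the function $U$ defined in Eq. (10.68),
and the gaussian spread is again $(\beta\hbar^2/12m)^{1/2}$. For a helium atom at
a temperature of 2 K, this spread amounts to about 0.7 Å. At room
temperature, however, it is only about 2 per cent of the 2.7-Å diameter
of the atom. The value of $\delta$ can now be written as

$$
\delta = \frac{\int_{-\infty}^{\infty} \left[W(\bar{x}) - \bar{V}(\bar{x})\right] e^{-\beta W(\bar{x})}\, d\bar{x}}{\int_{-\infty}^{\infty} e^{-\beta W(\bar{x})}\, d\bar{x}} \tag{11.24}
$$

The next step is to evaluate $W(\bar{x})$ by the requirement that we obtain
the minimum value for $F' - \delta$, as shown in Eq. (11.10). $F'$ is given by

$$
\begin{aligned}
e^{-\beta F'} &= \int\int_{x_a}^{x_a} e^{S'}\,\mathcal{D}x(u)\,dx_a \\
&= \int_{-\infty}^{\infty}\int_{x_a}^{x_a} \exp\left\{-\frac{m}{2}\int_0^\beta \dot{x}^2\,du - \beta W(\bar{x})\right\} \mathcal{D}x(u)\,dx_a \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-\beta W(\bar{x})} \left[\int_{x_a,\ \bar{x}\ \text{fixed}}^{x_a} \exp\left\{-\frac{m}{2}\int_0^\beta \dot{x}^2\,du\right\} \mathcal{D}'x(u)\right] d\bar{x}\,dx_a
\end{aligned}
\tag{11.25}
$$

The path integral is a simple one (see Eq. 11.17) whose value we know
to be $\sqrt{m/2\pi\beta}$, so that we obtain

$$
e^{-\beta F'} = \sqrt{\frac{m}{2\pi\beta}} \int_{-\infty}^{\infty} e^{-\beta W(\bar{x})}\, d\bar{x} \tag{11.26}
$$

The next step, finding the optimum choice for $W(\bar{x})$, requires us to
determine the effect of a small variation in the function $W(\bar{x})$ on the
value of $F' - \delta$ and set this effect equal to 0. Thus, imagining $W$ to be
replaced by

$$
W(\bar{x}) \to W(\bar{x}) + \eta(\bar{x}) \tag{11.27}
$$

we find from Eq. (11.26) that the variation in $F'$ is

$$
\delta F' = \frac{\int_{-\infty}^{\infty} \eta(\bar{x})\, e^{-\beta W(\bar{x})}\, d\bar{x}}{\int_{-\infty}^{\infty} e^{-\beta W(\bar{x})}\, d\bar{x}} \tag{11.28}
$$

and from Eq. (11.24) that the variation in $\delta$ is

$$
\delta\delta = \frac{\int \left\{\eta(\bar{x}) - \beta\eta(\bar{x})\left[W(\bar{x}) - \bar{V}(\bar{x})\right]\right\} e^{-\beta W(\bar{x})}\, d\bar{x}}{\int e^{-\beta W(\bar{x})}\, d\bar{x}}
+ \frac{\int \left[W(\bar{x}) - \bar{V}(\bar{x})\right] e^{-\beta W(\bar{x})}\, d\bar{x} \int \beta\eta(\bar{x})\, e^{-\beta W(\bar{x})}\, d\bar{x}}{\left(\int e^{-\beta W(\bar{x})}\, d\bar{x}\right)^2} \tag{11.29}
$$

Finding a stationary value for the right-hand side of Eq. (11.10) requires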
simply that

$$
\delta F' - \delta\delta = 0 \tag{11.30}
$$

which will be true if we take

$$
W(\bar{x}) = \bar{V}(\bar{x}) \tag{11.31}
$$

This, in turn, implies that $\delta$ is zero and that the upper bound on $F$ has
the same form as the classical free energy of Eq. (11.17). However, the
potential, in this upper bound, has been replaced by $\bar{V}(\bar{x})$. That is,

$$
e^{-\beta F'} = \sqrt{\frac{m}{2\pi\beta}} \int_{-\infty}^{\infty} e^{-\beta \bar{V}(\bar{x})}\, d\bar{x} \tag{11.32}
$$

where $\bar{V}(\bar{x})$, the effective classical potential, is given by Eq. (11.23).
For large values of $\beta$, the free energy is essentially the same as the
lowest energy level $E_0$; thus we can interpret Eq. (11.32) as providing
an approximation to $E_0$. This means that the variational approach has
produced the same result as that obtained in Chap. 10 and shown in
Eqs. (10.67) and (10.68).
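
The upper bound can be tried out on the harmonic oscillator, where the exact free energy is known. The sketch below takes the smeared potential to be a gaussian average of $V$ with variance $\beta/12m$ (the spread quoted above, with $\hbar = 1$), inserts it into the classical-form integral, and compares the result with the exact and classical free energies. The test potential, the trapezoid quadrature, the integration ranges, and all function names are assumptions made for this illustration.

```python
import math


def smeared_potential(potential, x_bar: float, beta: float, mass: float,
                      n: int = 400, width: float = 8.0) -> float:
    """Gaussian average of V about x_bar with variance beta / (12 m)."""
    a = 6.0 * mass / beta
    sigma = math.sqrt(beta / (12.0 * mass))
    lo = x_bar - width * sigma
    h = 2.0 * width * sigma / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        end_weight = 0.5 if i in (0, n) else 1.0
        total += end_weight * potential(x) * math.exp(-a * (x - x_bar) ** 2)
    return math.sqrt(a / math.pi) * total * h


def variational_free_energy(potential, beta: float, mass: float,
                            x_max: float, n: int = 400) -> float:
    """F' from the classical-form integral with V replaced by its smeared form."""
    h = 2.0 * x_max / n
    total = 0.0
    for i in range(n + 1):
        x_bar = -x_max + i * h
        end_weight = 0.5 if i in (0, n) else 1.0
        v_bar = smeared_potential(potential, x_bar, beta, mass)
        total += end_weight * math.exp(-beta * v_bar)
    z = math.sqrt(mass / (2.0 * math.pi * beta)) * total * h
    return -math.log(z) / beta


if __name__ == "__main__":
    mass, omega, beta = 1.0, 1.0, 4.0

    def harmonic(x: float) -> float:
        return 0.5 * mass * omega ** 2 * x ** 2

    x_max = 10.0 / math.sqrt(beta * mass * omega ** 2)
    f_var = variational_free_energy(harmonic, beta, mass, x_max)
    f_exact = math.log(2.0 * math.sinh(0.5 * beta * omega)) / beta
    f_classical = math.log(beta * omega) / beta
    print("classical:", f_classical, "exact:", f_exact, "variational:", f_var)
```

For this potential the smearing simply adds the constant $\beta\omega^2/24$ to $V$, so the variational value lies above the exact free energy while the purely classical value lies below it.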





THE STANDARD VARIATIONAL PRINCIPLE 


There is in quantum mechanics a standard variational principle, called 
the Rayleigh-Ritz method, which is this: If H is the hamiltonian of the 
system, whose lowest energy level is $E_0$, then with $f$ representing any
arbitrary function in configuration space $\Gamma$,

$$
E_0 \le \frac{\int f^* H f\, d\Gamma}{\int f^* f\, d\Gamma} \tag{11.33}
$$

This has wide application and is very easily demonstrated. If the function
$f$ is expanded as a series in the eigenfunctions $\phi_n$ belonging to the
hamiltonian, i.e., if $f = \sum_n a_n \phi_n$, it is evident that

$$
\frac{\int f^* H f\, d\Gamma}{\int f^* f\, d\Gamma} = \frac{\sum_n |a_n|^2 E_n}{\sum_n |a_n|^2} \tag{11.34}
$$

This latter expression is an average of the energy values (with positive
weights $|a_n|^2$) which therefore exceeds (or equals) the least value
$E_0$. The principle expressed in Eq. (11.33) has characteristics similar
to the principle of Eq. (11.14). In fact, Eq. (11.33) is a special case of
Eq. (11.14). (To be more precise, we should restrict this conclusion to
those cases for which the hamiltonian $H$ is derived from a lagrangian
which does not contain any magnetic field. Under this restriction, then,
the conclusion holds.) To see the relation between these two equations,
we shall consider the following example:
Suppose the action S is connected with a lagrangian such as 
$$
L = \frac{m}{2}\dot{x}^2 - V(x) \tag{11.35}
$$


where V (x) is independent of t. (Otherwise, of course, there are no fixed 
energy levels to seek!) We shall limit ourselves to the case of a single 
variable x, but the general case follows directly. We note here that if 
the lagrangian contains the term $\dot{x}A$ (for example, if the lagrangian
represents a particle in a magnetic field), then Eq. (11.33) is still correct.
However, the action $S$ is complex. In this case we suspect that
Eq. (11.14) (or some simple modification of this equation) is still valid.
However, this has not been proved. So, for the present we shall limit
our discussion to a case in which no magnetic field is present. Then in
the limit for large values of $\beta$ we have

$$
e^{-\beta E_0} \sim \int \exp\left\{-\frac{m}{2}\int_0^\beta \dot{x}^2(u)\,du - \int_0^\beta V(x(u))\,du\right\} \mathcal{D}x(u) \tag{11.36}
$$


Now suppose we use for our trial action S’ the form 


/ m E ser 
S S l L udu- | V'(x(u)) du (11.37) 


which involves some other potential V’(x). This means that 


p 
fada j V"(ar(u)) — V(a(u))} de (11.38) 


p / 
, lal [V'(z(u))— V (z(u))| du e° Do) 


fe Da(u) 


If we were to define the mean value of any function which depends on
the path x(u) in such a manner as this, we would find that the value is
nearly independent of u so long as u was not too close to either 0 or β.
Therefore, to a sufficient approximation, we can write

$$\delta = \frac{\int \left[V'(x(\bar{u})) - V(x(\bar{u}))\right] e^{S'} \, \mathcal{D}x(u)}{\int e^{S'} \, \mathcal{D}x(u)} = \left\langle V'(x(\bar{u})) \right\rangle - \left\langle V(x(\bar{u})) \right\rangle \tag{11.39}$$

where ū is a “representative” value of u between 0 and β. Following
the methods developed earlier, we can evaluate this path integral if
we assume that the energy values E′ₙ and the energy functions φ′ₙ(x)
belonging to S′ are known. If our path goes from x_a to x_b, for
example, we obtain

$$\langle f \rangle = \frac{\sum_m \sum_n \phi'_m(x_b)\, \phi'^*_n(x_a)\, e^{-E'_m(\beta - \bar{u})}\, e^{-E'_n \bar{u}}\, f_{mn}}{\sum_n e^{-\beta E'_n}\, \phi'_n(x_b)\, \phi'^*_n(x_a)} \tag{11.41}$$

$$f_{mn} = \int_{-\infty}^{\infty} \phi'^*_m(x)\, f(x)\, \phi'_n(x)\, dx \tag{11.42}$$

But, if β approaches infinity and ū is likewise large (for example,
ū = β/2), all the higher exponentials are negligible compared to the
exponential involving the lowest energy term E′₀. Thus in the limit

$$\lim_{\beta \to \infty} \langle f \rangle = f_{00} \tag{11.43}$$

This result can be written as

$$\delta = \int \phi'^*_0(x)\, V'(x)\, \phi'_0(x)\, dx - \int \phi'^*_0(x)\, V(x)\, \phi'_0(x)\, dx \tag{11.44}$$


Of course, to use Eq. (11.14) we must subtract this value from E′₀.
If H′ is the hamiltonian associated with S′, that is, if

$$H' = \frac{p^2}{2m} + V'(x) \tag{11.45}$$

then

$$H' \phi'_0 = E'_0 \phi'_0 \tag{11.46}$$

so that

$$E'_0 - \delta = \int \phi'^*_0 H' \phi'_0 \, dx + \int \phi'^*_0 V \phi'_0 \, dx - \int \phi'^*_0 V' \phi'_0 \, dx \tag{11.47}$$

But the true hamiltonian can be written as

$$H = \frac{p^2}{2m} + V = \frac{p^2}{2m} + V' + V - V' = H' + V - V' \tag{11.48}$$

and this means that

$$E_0 \le \int \phi'^*_0(x)\, H\, \phi'_0(x)\, dx \tag{11.49}$$

where φ′₀(x) is normalized and is the wave function corresponding to
the lowest energy state of the hamiltonian H′. The estimate of the
lowest energy level given in Eq. (11.49) involves the arbitrary potential
V′(x) only through the wave function φ′₀(x). Since this potential was
arbitrary, so is the wave function φ′₀(x). Therefore, instead of choosing
an arbitrary potential and finding from it the resulting wave function
and then proceeding to evaluate Eq. (11.49), we could instead pick the
wave function itself and then evaluate Eq. (11.49) without ever bothering
about the potential to which this arbitrary wave function belongs. The
variable function in this process is then the wave function φ′₀(x) rather
than the potential function V′(x). We find, then, that this result is
simply another way of stating the Rayleigh-Ritz method, Eq. (11.33).
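As a numerical illustration of Eq. (11.33), the following sketch evaluates the Rayleigh-Ritz quotient for a one-dimensional harmonic oscillator with gaussian trial functions. Everything specific here is an assumption made for illustration and does not come from the text: the system (H = p²/2 + x²/2 in units with ħ = m = ω = 1, so that E₀ = 1/2), the trial family f(x) = exp(−ax²/2), and the helper names `simpson` and `rayleigh_quotient`.

```python
import math

def simpson(g, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with n (even) intervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def rayleigh_quotient(a, half_width=12.0):
    """<f|H|f> / <f|f> for H = p^2/2 + x^2/2 and f(x) = exp(-a x^2 / 2).

    After an integration by parts, <f|H|f> is the integral of
    (f'(x)^2 + x^2 f(x)^2) / 2 over the whole line.
    """
    f = lambda x: math.exp(-0.5 * a * x * x)
    df = lambda x: -a * x * f(x)
    num = simpson(lambda x: 0.5 * (df(x) ** 2 + (x * f(x)) ** 2),
                  -half_width, half_width)
    den = simpson(lambda x: f(x) ** 2, -half_width, half_width)
    return num / den

E0 = 0.5  # exact ground-state energy in these units
for a in (0.25, 0.5, 1.0, 2.0, 4.0):
    print(f"a = {a:4.2f}   quotient = {rayleigh_quotient(a):.6f}   E0 = {E0}")
```

For this family the quotient can also be worked out by hand as a/4 + 1/(4a), which never falls below 1/2 and reaches it only at a = 1, where the trial function coincides with the true ground state.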

If the problems such as the one given in this example were the only 
ones in which the concept expressed in Eq. (11.14) were useful, then 
there would not be much point to this long discussion. But there are 
much more complicated integrals for which Eq. (11.14) can be used in 
a way that, at least as far as we can tell, is not so easily transformable 
into Eq. (11.33). We shall describe such an example in the next section.


SLOW ELECTRONS IN A POLAR CRYSTAL¹


We imagine an electron moving in a polar crystal, such as sodium
chloride. The electron interacts with ions, which are not rigidly fixed. Thus,
the electron creates in its neighborhood a distortion of the crystal lattice, 
and if the electron moves about, the region of distortion moves with it. 
This electron, together with its distorted environment, has been called 
a polaron. 

One consequence of the lattice distortion is that the energy of the
electron is lowered. Furthermore, since as the electron moves the ions
must move to adjust the distortions, the effective inertia of the electron
(or, to use the currently accepted term, the mass of the polaron) is
higher than simply the mass the electron would obtain if the lattice were
composed of rigidly fixed points. The precise motion of such a polaron
analyzed quantum-mechanically is exceedingly complicated. We shall,
however, make a number of approximations whose justification in the
real case may be quite difficult. Nevertheless, we shall arrive at an
idealized problem which has been studied by a number of physicists.²
It has been studied not only because of its possible connection with the
real behavior of an electron and a crystal, but also because it represents
one of the simplest examples of the interaction of a particle and a field.
The path integral variational method has been very successful in the
solution of this idealized problem.

¹ R.P. Feynman, Slow Electrons in a Polar Crystal, Phys. Rev., vol. 97,
pp. 660-665, 1955.

² For example, H. Fröhlich, Electrons in Lattice Fields, Advanc. Phys.,
vol. 3, pp. 325-361, 1954. References to other works are given in this
article.

First, we note that even if the ions were rigidly fixed in the crystal, 
the electron would still move in a very complicated potential. In such a 
case, one can show that there are solutions of the Schrödinger equation 
for the electron with characteristic wave numbers k. The energy levels 
of these solutions are generally very complicated functions of the wave 
number. Nevertheless, we assume that the relation between the energy 
E and the wave number k is still a quadratic form, such as 

$$E = \frac{\hbar^2 k^2}{2m} \tag{11.50}$$

where m is a constant (not necessarily the mass of an electron in a 
vacuum). Next, we note that the force which the electron exerts on 
the lattice is such as to push away the negative ions and attract the 
positive ions. The motion of these ions will be analyzed by considering 
them as a set of harmonic oscillators and employing the methods of 
Chap. 8. However, we shall assume that the only harmonic modes which 
we need are those with high frequency, in which ions of opposite sign 
of charge move in opposite directions. The frequency ωₖ of each mode
then depends on the wave number k of the mode. However, we shall
neglect this dependence and assume that ω is a constant.

Our object is to find the electrical force generated by a distortion 
characterized by the wave number k and find the interaction of the 
electron in this force. Here, we neglect the atomic structure and treat 
the material of our crystal as simply a continuous dielectric which carries 
waves of polarization. If P is the polarization, written in the form of a 
longitudinal wave

$$\mathbf{P} = \mathbf{a}_k \, e^{i\mathbf{k}\cdot\mathbf{r}} \tag{11.51}$$

then the charge density from the ions is

$$\rho(\mathbf{r}) = -\nabla \cdot \mathbf{P} = -i k a_k \, e^{i\mathbf{k}\cdot\mathbf{r}} \tag{11.52}$$

If the potential is φ(r), we have

$$\nabla^2 \phi = -4\pi \rho(\mathbf{r}) \tag{11.53}$$

Thus if qₖ is the amplitude of the kth longitudinal running wave, the
polarization amplitude aₖ is proportional to qₖ, and the energy of
interaction between the wave of polarization and the electron at x is
proportional to the sum over all values of k of the terms (qₖ/k)e^{ik·x}.

Since the energy and the momentum of the electron are related
through E = p²/2m, we can write the lagrangian of the entire system
as

$$L = \frac{1}{2}|\dot{\mathbf{x}}|^2 + \frac{1}{2}\sum_{\mathbf{k}} \left(\dot{q}_k^2 - q_k^2\right) - \left(\frac{2\sqrt{2}\,\pi\alpha}{V}\right)^{1/2} \sum_{\mathbf{k}} \frac{1}{k}\, q_k \, e^{i\mathbf{k}\cdot\mathbf{x}} \tag{11.54}$$


The first term of this expression is the energy of the electron in a
rigid lattice, where x is its position. The second term is the lagrangian
of the oscillations of polarizations taken alone, where it is assumed that
all waves of polarization have the same frequency and the coordinate of
the kth mode is qₖ. The last term is the lagrangian of the interaction
between the electron and the lattice vibrations, where V represents the
volume of the crystal and α is a constant. To simplify writing of all
our subsequent formulas, we have written this in dimensionless form.
That is, the scales of energy, length, and time are so chosen that not
only ħ but also the common frequency ω of the oscillators and the mass
m of the electron are all unity. The coupling constant α is then the
dimensionless ratio

$$\alpha = \frac{1}{\sqrt{2}} \left(\frac{1}{\epsilon_\infty} - \frac{1}{\epsilon}\right) e^2 \tag{11.55}$$

where ε and ε_∞ are the static and high-frequency dielectric constants,
respectively. In a typical case, such as the crystal of NaCl, the value of
α is about 5. The values of the energy which we shall calculate are in
units of ħω.

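Now we can study the quantum-mechanical motion of the electron,
solving the motion of the harmonic oscillators completely. For example,
the amplitude that the electron starts at x_a with the oscillators in the
ground state and ends a time T later at x_b with the oscillators still in
the ground state is

$$G_{0,0}(b, a) = \int e^{iS} \, \mathcal{D}\mathbf{x}(t) \tag{11.56}$$

where, using Eq. (8.138),

$$S = \frac{1}{2}\int_0^T |\dot{\mathbf{x}}|^2 \, dt + i\sqrt{2}\,\pi\alpha \int_0^T\!\!\int_0^T\!\!\int \frac{1}{k^2}\, e^{i\mathbf{k}\cdot\mathbf{x}(t)}\, e^{-i\mathbf{k}\cdot\mathbf{x}(s)}\, e^{-i|t-s|}\, \frac{d^3k}{(2\pi)^3}\, ds\, dt \tag{11.57}$$

Performing the integral over wave numbers k gives

$$S = \frac{1}{2}\int_0^T |\dot{\mathbf{x}}|^2 \, dt + \frac{i\alpha}{\sqrt{8}} \int_0^T\!\!\int_0^T \frac{e^{-i|t-s|}}{|\mathbf{x}(t) - \mathbf{x}(s)|}\, ds\, dt \tag{11.58}$$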




The quantity G₀,₀(b, a) depends upon the initial and final positions of the
electron, x_a and x_b, and upon the time interval we are considering, T.
Since this function is a kernel, it is a solution of the Schrödinger wave
equation, considered as a function of the time interval T. Therefore,
we realize that it will contain frequencies in its exponentials which are
proportional to the energy levels Eₙ. It is the lowest one of these energy
levels which we now seek.

In developing our variational principle, as we have explained, we are
not interested in the kernel for real time intervals T. Instead, we want
quantities such as those which appear in Eq. (11.8) for large values of β.
By following all the steps leading to Eq. (11.58), it can be readily shown
for imaginary values of the time variable that the resulting kernel has
the form

$$K(b, a) = \int e^{S} \, \mathcal{D}\mathbf{x}(t) \tag{11.59}$$

where the variable t (previously called u) goes from 0 to β and

$$S = -\frac{1}{2}\int_0^\beta |\dot{\mathbf{x}}|^2 \, dt + \frac{\alpha}{\sqrt{8}} \int_0^\beta\!\!\int_0^\beta \frac{e^{-|t-s|}}{|\mathbf{x}(t) - \mathbf{x}(s)|}\, ds\, dt \tag{11.60}$$

This result is just that which one might expect from the replacement of
t in Eq. (11.58) by the imaginary time variable −it (previously called
−iu). Asymptotically, for large values of β, this kernel becomes
proportional to e^{−βE₀}.

We now have a relatively complicated path integral on which to try
our variational principle. Next, we shall have to choose some simple
action S′, which roughly approximates the true action S, and then find
E′₀ and δ.

We note that in Eq. (11.60) the particle considered at any particular
time¹ “interacts” with its position at a past time by a reaction which is
inversely proportional to the distance traveled between these two times,
and which dies out exponentially with the time difference. The reason
for this is that the disturbance set up by the electron in the crystal
lattice in the past takes some time to die out. That is, it takes some
time for the ions to relax, and during this relaxation period the electron
still “feels” the old disturbance.

¹ Although t in Eq. (11.60) is not really a time, but an integration
variable instead, it is useful to think about t as a time, just as we
thought of u as a time below Eq. (10.43).

We shall try an action S′ which has this same property, except that
instead of involving the inverse distance as a coupling law, the
attraction will have the geometric form of a parabolic well. This would be a
poor approximation if the distance |x(t) − x(s)| could very often become
exceedingly large. However, since there is a limited time available
before the exponential time factor forces the interaction to die out, large
values of this difference will not make any important contributions to
the integral. Thus, we shall try

$$S' = -\frac{1}{2}\int_0^\beta |\dot{\mathbf{x}}|^2 \, dt - \frac{C}{2} \int_0^\beta\!\!\int_0^\beta |\mathbf{x}(t) - \mathbf{x}(s)|^2 \, e^{-w|t-s|}\, ds\, dt \tag{11.61}$$


The constant C is a measure of the strength of the attraction between 
the electron and the previously created disturbance. We take this as 
an adjustable parameter. Furthermore, we can with no extra difficulty 
permit the exponential cutoff law to contain the adjustable parameter 
w, which may differ from unity. With this extra parameter we can 
partly compensate for the imperfection which we have introduced by 
replacing the inverted distance effect by a parabolic effect. (We also
note in this regard that adding an extra constant to the parabolic term
|x(t) − x(s)|² leads to no further freedom, since such a term would drop
out in evaluating a formula for E₀.) We shall adjust variable parameters
C and w later in the evaluation in order to make E₀ a minimum.

Since the action S′ we have picked is quadratic, all of the path
integrals which result are easily worked out by the methods described in
Sec. 3-5.

By comparing Eqs. (11.60) and (11.61), we find that

$$\begin{aligned}
\delta &= \frac{1}{\beta}\langle S - S' \rangle \\
&= \frac{\alpha}{\sqrt{8}\,\beta} \int_0^\beta\!\!\int_0^\beta \left\langle \frac{1}{|\mathbf{x}(t) - \mathbf{x}(s)|} \right\rangle e^{-|t-s|}\, ds\, dt
 + \frac{C}{2\beta} \int_0^\beta\!\!\int_0^\beta \left\langle |\mathbf{x}(t) - \mathbf{x}(s)|^2 \right\rangle e^{-w|t-s|}\, ds\, dt \\
&= A + B
\end{aligned} \tag{11.62}$$


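We shall concentrate our attention on the first term on the right-hand
side of this equation, A. In this term we can express |x(t) − x(s)|⁻¹ by
a Fourier transform. As a matter of fact, this term is the result of the
Fourier transform involved in the step between Eqs. (11.57) and (11.58).
So we have

$$|\mathbf{x}(\tau) - \mathbf{x}(\sigma)|^{-1} = \int \exp\{i\mathbf{k}\cdot[\mathbf{x}(\tau) - \mathbf{x}(\sigma)]\}\, \frac{d^3k}{2\pi^2 k^2} \tag{11.63}$$

For this reason we need to study

$$\left\langle \exp\{i\mathbf{k}\cdot[\mathbf{x}(\tau) - \mathbf{x}(\sigma)]\} \right\rangle = \frac{\int \exp\{i\mathbf{k}\cdot[\mathbf{x}(\tau) - \mathbf{x}(\sigma)]\}\, e^{S'} \, \mathcal{D}\mathbf{x}(t)}{\int e^{S'} \, \mathcal{D}\mathbf{x}(t)} \tag{11.64}$$

The integral in the numerator is of the form

$$I = \int \exp\left\{ -\frac{1}{2}\int_0^\beta |\dot{\mathbf{x}}|^2 \, dt - \frac{C}{2}\int_0^\beta\!\!\int_0^\beta |\mathbf{x}(t) - \mathbf{x}(s)|^2 e^{-w|t-s|}\, ds\, dt + \int_0^\beta \mathbf{f}(t)\cdot\mathbf{x}(t)\, dt \right\} \mathcal{D}\mathbf{x}(t) \tag{11.65}$$

where specifically

$$\mathbf{f}(t) = i\mathbf{k}\,\delta(t - \tau) - i\mathbf{k}\,\delta(t - \sigma) \tag{11.66}$$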


Now we shall evaluate Eq. (11.65) in so far as it depends on f or k aside
from a normalization factor which drops out in Eq. (11.64). Incidentally,
let us notice that the three rectangular components separate in
Eq. (11.65) and we need consider only a scalar case. The method of
integration is the same as that introduced in Sec. 3-5 for the evaluation of
gaussian path integrals. Thus we substitute X(t) = X̄(t) + Y(t) where
X̄(t) is that special function for which the exponent is maximum. The
variable of integration is now Y(t). Since the exponent is quadratic in
X(t) and X̄(t) renders it an extremum, it can contain Y(t) only
quadratically; so Y(t) then separates off as a factor not containing f, which may
be integrated to give an unimportant constant (depending on β only).
Therefore, within such a constant

$$I = \exp\left\{ -\frac{1}{2}\int_0^\beta \left(\frac{d\bar{X}}{dt}\right)^2 dt - \frac{C}{2}\int_0^\beta\!\!\int_0^\beta [\bar{X}(t) - \bar{X}(s)]^2 e^{-w|t-s|}\, ds\, dt + \int_0^\beta f(t)\bar{X}(t)\, dt \right\} \tag{11.67}$$

where X̄(t) is that function which minimizes the expression (subject
for convenience to the boundary condition X̄(0) = X̄(β) = 0). The
variational problem gives the integral equation

$$\frac{d^2\bar{X}(t)}{dt^2} = 2C\int_0^\beta [\bar{X}(t) - \bar{X}(s)]\, e^{-w|t-s|}\, ds - f(t) \tag{11.68}$$

Using Eq. (11.68), Eq. (11.67) can be simplified to

$$I = \exp\left\{ \frac{1}{2}\int_0^\beta f(t)\bar{X}(t)\, dt \right\} \tag{11.69}$$

We need merely solve Eq. (11.68) and substitute into Eq. (11.69).
To do this, we define

$$Z(t) = \frac{w}{2}\int_0^\beta \bar{X}(s)\, e^{-w|t-s|}\, ds \tag{11.70}$$

so that

$$\frac{d^2 Z(t)}{dt^2} = w^2 [Z(t) - \bar{X}(t)] \tag{11.71}$$

while Eq. (11.68) is

$$\frac{d^2\bar{X}(t)}{dt^2} = \frac{4C}{w}[\bar{X}(t) - Z(t)] - f(t) \tag{11.72}$$

The equations are readily separated and solved. The solution for X̄(t)
substituted into Eq. (11.69) gives, for the case of Eq. (11.66),

$$I = \exp\left\{\tfrac{1}{2} ik[\bar{X}(\tau) - \bar{X}(\sigma)]\right\}
= \exp\left\{ -\frac{2C}{v^3 w} k^2 \left(1 - e^{-v|\tau - \sigma|}\right) - \frac{w^2}{2v^2} k^2 |\tau - \sigma| \right\} \tag{11.73}$$

where we have defined

$$v^2 = w^2 + \frac{4C}{w} \tag{11.74}$$
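The phrase “readily separated and solved” compresses a short calculation, and one way to carry it out is sketched below. The auxiliary combinations D and U are names introduced here for illustration only; they do not appear in the text. The end points are taken to be far away (β → ∞), so that the boundary conditions play no role.

```latex
% Define D = \bar{X} - Z and U = w^2 \bar{X} + (4C/w) Z.
% With v^2 = w^2 + 4C/w, Eqs. (11.71) and (11.72) decouple:
\begin{align*}
\frac{d^2 D}{dt^2} &= v^2 D - f(t), &
\frac{d^2 U}{dt^2} &= -w^2 f(t), &
\bar{X} &= \frac{1}{v^2}\left[\, U + \frac{4C}{w}\, D \,\right].
\end{align*}
% Solutions far from the end points:
\begin{align*}
D(t) &= \frac{1}{2v}\int e^{-v|t-s|} f(s)\, ds, &
U(t) &= -\frac{w^2}{2}\int |t-s|\, f(s)\, ds
\end{align*}
% (plus a term linear in t whose contribution vanishes as \beta \to \infty).
% Insert f(t) = ik[\delta(t-\tau) - \delta(t-\sigma)] into Eq. (11.69):
\begin{align*}
\frac{1}{2}\int f \bar{X}\, dt
  &= \frac{1}{2v^2}\left[ \int f U\, dt + \frac{4C}{w}\int f D\, dt \right] \\
  &= \frac{1}{2v^2}\left[ -w^2 k^2 |\tau-\sigma|
     - \frac{4C}{w}\,\frac{k^2}{v}\left(1 - e^{-v|\tau-\sigma|}\right) \right] \\
  &= -\frac{2C}{v^3 w}\, k^2 \left(1 - e^{-v|\tau-\sigma|}\right)
     - \frac{w^2}{2v^2}\, k^2 |\tau-\sigma| .
\end{align*}
```

The last line is the exponent of Eq. (11.73).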


The result is correctly normalized, since it is valid for k = 0. Upon
substitution from Eq. (11.73) into Eq. (11.63) there results an integral
over k which is a simple gaussian, so that substitution into A gives, in
the limit β → ∞,

$$A = \frac{\alpha v}{\sqrt{\pi}} \int_0^\infty \left[ w^2 \tau + \frac{v^2 - w^2}{v}\left(1 - e^{-v\tau}\right) \right]^{-1/2} e^{-\tau}\, d\tau \tag{11.75}$$

To find B, we need ⟨|x(t) − x(s)|²⟩. This can be obtained by expanding
both sides of Eq. (11.73) with respect to k up to order k². Therefore

$$\frac{1}{3}\left\langle |\mathbf{x}(\tau) - \mathbf{x}(\sigma)|^2 \right\rangle = \frac{4C}{v^3 w}\left(1 - e^{-v|\tau - \sigma|}\right) + \frac{w^2}{v^2}|\tau - \sigma| \tag{11.76}$$

The integral in B is now easily performed and, in the β → ∞ limit, the
expression simplifies to

$$B = \frac{3C}{vw} \tag{11.77}$$
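The step from Eq. (11.76) to Eq. (11.77) can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the β → ∞ limit, in which (1/β) times the double integral of a function of |t − s| tends to twice its integral over the positive half line, and the function names and sample values of C and w are hypothetical.

```python
import math

def simpson(g, lo, hi, n=20000):
    """Composite Simpson rule on [lo, hi] with n (even) intervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def mean_square_displacement(u, C, w):
    """<|x(tau) - x(sigma)|^2> from Eq. (11.76), with u = |tau - sigma|."""
    v = math.sqrt(w * w + 4.0 * C / w)            # Eq. (11.74)
    return 3.0 * (4.0 * C / (v ** 3 * w) * (1.0 - math.exp(-v * u))
                  + (w / v) ** 2 * u)

def B_numeric(C, w):
    """Second term of Eq. (11.62) in the limit beta -> infinity."""
    g = lambda u: mean_square_displacement(u, C, w) * math.exp(-w * u)
    # (C / 2 beta) * double integral  ->  (C / 2) * 2 * integral over u > 0
    return 0.5 * C * 2.0 * simpson(g, 0.0, 60.0 / w)

def B_closed(C, w):
    """Eq. (11.77)."""
    v = math.sqrt(w * w + 4.0 * C / w)
    return 3.0 * C / (v * w)

for C, w in ((1.0, 1.0), (5.0, 2.5), (0.3, 3.0)):
    print(f"C = {C}, w = {w}: numeric {B_numeric(C, w):.8f}, "
          f"closed form {B_closed(C, w):.8f}")
```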


In addition we need E′₀, the ground-state energy associated with our
action S′. This is most easily obtained by noting that, in parallel with
Eqs. (11.2) and (11.13),

$$e^{-\beta E'_0} \approx \int\!\!\int e^{S'} \, \mathcal{D}\mathbf{x}(u)\, d\mathbf{x}_a \qquad (\beta \to \infty)$$

Differentiating both sides with respect to C, one finds immediately

$$\frac{dE'_0}{dC} = \frac{B}{C} \tag{11.78}$$

so that, in view of Eqs. (11.77) and (11.74), integration gives

$$E'_0 = \frac{3}{2}(v - w) \tag{11.79}$$

since E′₀ = 0 when C = 0 (the free particle). Finally, using Eqs. (11.14)
and (11.62), we obtain for the true ground state energy

$$E_0 \le E'_0 - A - B = \frac{3}{4v}(v - w)^2 - A \tag{11.80}$$

with A given in Eq. (11.75). The quantities v and w are two parameters
which may be varied separately to obtain a minimum.

The integral in A, unfortunately, cannot be performed in closed form,
so that a complete determination of E₀ requires numerical integration.
It is, however, possible to obtain approximate expressions in various
limiting cases. The choice w = 0, corresponding to a fixed harmonic
binding potential in Eq. (11.61), leads to

$$A = \alpha\left(\frac{v}{\pi}\right)^{1/2} \int_0^\infty \left(1 - e^{-v\tau}\right)^{-1/2} e^{-\tau}\, d\tau = \frac{\alpha\,\Gamma(1/v)}{v^{1/2}\,\Gamma(1/2 + 1/v)} \tag{11.81}$$

and reduces the first term of Eq. (11.80) to 3v/4. The case of large α
corresponds to large v, in which case e^{−vτ} can be neglected, so that
A = α(v/π)^{1/2}. For α less than 5.8 and w = 0, Eq. (11.80) does not
give a minimum unless v = 0, so that the w = 0 case does not give a
single expression for all ranges of α. In spite of this disadvantage, the
result with Eq. (11.81) is relatively simple and fairly accurate. For
α > 6, only fairly large values of v are important, and the asymptotic
formula (good to 1 per cent for v > 4)

$$A = \alpha\left(\frac{v}{\pi}\right)^{1/2}\left(1 + \frac{2\ln 2}{v}\right) \tag{11.82}$$

is convenient. Fröhlich, however, considers the discontinuity at α = 6 as
a serious disadvantage, a disadvantage which can be avoided in our
present approach by choosing w different from zero.
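Since A has no closed form for general w, the bound of Eq. (11.80) has to be evaluated numerically. The following is a minimal sketch of one way to do this, not the procedure used in the original numerical work: the substitution τ = s², the Simpson rule, the crude compass search, the starting point, and all function names are assumptions chosen for brevity.

```python
import math

def simpson(g, lo, hi, n):
    """Composite Simpson rule on [lo, hi] with n (even) intervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def A_integral(alpha, v, w, s_max=7.0, n=700):
    """A of Eq. (11.75), integrated after substituting tau = s^2.

    The substitution removes the integrable tau^(-1/2) singularity at
    tau = 0; the integrand in s tends to the finite value 2/v there.
    """
    c = (v * v - w * w) / v
    def g(s):
        if s == 0.0:
            return 2.0 / v
        tau = s * s
        # -expm1(-x) equals 1 - exp(-x) without cancellation at small x
        root = math.sqrt(w * w * tau - c * math.expm1(-v * tau))
        return 2.0 * s * math.exp(-tau) / root
    return alpha * v / math.sqrt(math.pi) * simpson(g, 0.0, s_max, n)

def energy_bound(alpha, v, w):
    """Right-hand side of Eq. (11.80)."""
    return 0.75 * (v - w) ** 2 / v - A_integral(alpha, v, w)

def minimize_bound(alpha, v=3.0, w=2.5, step=0.5, tol=1e-3):
    """Compass search over (v, w): try axis moves, halve the step when stuck."""
    best = energy_bound(alpha, v, w)
    while step > tol:
        improved = False
        for dv, dw in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            nv, nw = v + dv, w + dw
            if nv <= 0.0 or nw <= 0.0:
                continue
            e = energy_bound(alpha, nv, nw)
            if e < best:
                best, v, w, improved = e, nv, nw, True
        if not improved:
            step /= 2.0
    return best, v, w

alpha = 3.0
e_min, v_min, w_min = minimize_bound(alpha)
print(f"alpha = {alpha}: bound = {e_min:.4f} at v = {v_min:.2f}, w = {w_min:.2f}")
```

Two simple checks on such a routine follow from the formulas above: setting v = w makes the first term vanish and gives A = α, so the bound is exactly −α, and setting w = 0 must reproduce the gamma-function expression of Eq. (11.81).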

Let us study Eq. (11.80) in case w is not zero. For small α, the
minimum will occur for v near w. Therefore, we write v = (1 + ε)w,
consider ε small, and expand the root in Eq. (11.75). This gives

$$A = \alpha\,\frac{v}{w}\left[ 1 - \epsilon \int_0^\infty \tau^{-3/2}\left(1 - e^{-w\tau}\right) e^{-\tau}\, \frac{d\tau}{w\sqrt{\pi}} + \cdots \right] \tag{11.83}$$

The integral is

$$\frac{2}{w}\left[(1 + w)^{1/2} - 1\right] = P \tag{11.84}$$

The problem of Eq. (11.80) then corresponds, in this order, to
minimizing

$$E_0 = \frac{3}{4} w \epsilon^2 - \alpha - \alpha\epsilon(1 - P) \tag{11.85}$$

That is,

$$\epsilon = \frac{2\alpha(1 - P)}{3w} \tag{11.86}$$

which is valid for small α only, because ε was assumed small. The
resulting energy is

$$E_0 = -\alpha - \frac{\alpha^2 (1 - P)^2}{3w} \tag{11.87}$$

Our method therefore gives a correction even for small α. It is least for
w = 3, in which case it gives

$$E_0 = -\alpha - \frac{\alpha^2}{81} = -\alpha - 1.23\left(\frac{\alpha}{10}\right)^2 \tag{11.88}$$

It is not sensitive to the choice of w. For example, for w = 1 the
1.23 falls only to 0.98. The method of Lee and Pines¹ gives exactly
the result of Eq. (11.88) to this order. The perturbation expansion has
been carried out to second order by Haga,² who shows that the exact
coefficient of the (α/10)² term should be 1.26, so that our variational
method is remarkably accurate for small α.
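The numbers quoted for small α follow directly from Eqs. (11.84) and (11.87), as the short sketch below shows. The function names are illustrative choices, not notation from the text.

```python
import math

def P(w):
    """Eq. (11.84)."""
    return 2.0 / w * (math.sqrt(1.0 + w) - 1.0)

def second_order_coefficient(w):
    """Coefficient c in E0 = -alpha - c * (alpha/10)^2, from Eq. (11.87)."""
    return 100.0 * (1.0 - P(w)) ** 2 / (3.0 * w)

for w in (1.0, 2.0, 3.0, 4.0):
    print(f"w = {w:.1f}   P = {P(w):.4f}   "
          f"coefficient = {second_order_coefficient(w):.3f}")

# Scan a grid of w values for the largest coefficient (lowest energy).
grid = [i / 10 for i in range(1, 101)]
best_w = max(grid, key=second_order_coefficient)
print("largest coefficient on the grid at w =", best_w)
```

Writing u = (1 + w)^{1/2} turns the coefficient into 100(u − 1)/[3(u + 1)³], whose maximum lies at u = 2, that is, at w = 3, in agreement with the statement in the text.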

¹ T.-D. Lee and D. Pines, Interaction of a Nonrelativistic Particle with
a Scalar Field with Application to Slow Electrons in Polar Crystals,
Phys. Rev., vol. 92, pp. 883-889, 1953.

² E. Haga, Note on the Slow Electrons in a Polar Crystal, Prog. Theoret.
Phys. (Kyoto), vol. 11, pp. 449-460, 1954.

The opposite extreme of a large α corresponds to large v and, as we
shall see, to w near 1. Since v ≫ w, the integral Eq. (11.75) reduces in
the first approximation to Eq. (11.81), which we can use in its
asymptotic form. The next approximation in w can be obtained by expanding
the radical in Eq. (11.75), considering w/v ≪ 1. Furthermore, e^{−vτ} is
negligible. In this way we get

$$E_0 = \frac{3}{4v}(v - w)^2 - \alpha\left(\frac{v}{\pi}\right)^{1/2}\left(1 + \frac{2\ln 2}{v} - \frac{w^2}{2v}\right) \tag{11.89}$$

This is minimum, within our approximation of large v, when w = 1 and
v = (4α²/9π) − (4 ln 2 − 1). Then we find¹

$$E_0 = -\frac{\alpha^2}{3\pi} - 3\ln 2 - \frac{3}{4} = -0.1061\,\alpha^2 - 2.83 \tag{11.90}$$

The approximations do not keep E₀ as an upper limit because,
unfortunately, the further terms, of order 1/α², are probably positive.
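The large-α formulas can be cross-checked against one another with a few lines of arithmetic. The sketch below inserts the quoted minimizing v (with w = 1) into Eq. (11.89) and compares the result with Eq. (11.90); the function names and the sample values of α are assumptions for illustration.

```python
import math

def v_large_alpha(alpha):
    """Minimizing v (for w = 1) quoted below Eq. (11.89)."""
    return 4.0 * alpha ** 2 / (9.0 * math.pi) - (4.0 * math.log(2.0) - 1.0)

def energy_11_89(alpha, v, w=1.0):
    """Right-hand side of Eq. (11.89)."""
    bracket = 1.0 + 2.0 * math.log(2.0) / v - w * w / (2.0 * v)
    return 0.75 * (v - w) ** 2 / v - alpha * math.sqrt(v / math.pi) * bracket

def energy_11_90(alpha):
    """Eq. (11.90)."""
    return -alpha ** 2 / (3.0 * math.pi) - 3.0 * math.log(2.0) - 0.75

print("1/(3 pi)      =", round(1.0 / (3.0 * math.pi), 4))
print("3 ln 2 + 3/4  =", round(3.0 * math.log(2.0) + 0.75, 2))
for alpha in (10.0, 20.0, 40.0):
    v = v_large_alpha(alpha)
    print(f"alpha = {alpha:5.1f}  v = {v:8.2f}  "
          f"Eq. (11.89): {energy_11_89(alpha, v):10.4f}  "
          f"Eq. (11.90): {energy_11_90(alpha):10.4f}")
```

The two expressions differ only by terms that fall off with increasing v, so the agreement improves as α grows.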

Detailed and numerical work based on this approach has been carried
out by T.D. Schultz.² Using a digital computer, Schultz worked out
values of v and w which would give a minimum for several different
values of α. He also evaluated E₀ and compared it with the values which
would be obtained from several alternative theories. In particular, he
worked out the self-energy from the theories of Lee, Low, and Pines³
(E_llp), Lee and Pines⁴ (E_lp), Gross⁵ (E_g), and Pekar,⁶ Bogoliubov,⁷
and Tyablikov⁸ (E_pbt).

The results for α, v, and w and also for the energies given by the
Feynman theory (E_f) compared with energies derived from the other
theories are given below, in a table reproduced from the paper of Schultz.
In this table, both ħ and ω are assumed to have the value 1. Note that
for all values of α, the value of E_f is less than all others.


    α      3.00      5.00      7.00      9.00      11.00

[Table rows for v, w, and the energies are not legible in this copy. One
surviving row reads −3.0000, −5.0000, −7.0000, −9.00…]


¹ S.I. Pekar in Theory of Polarons, Zh. Eksperim. i Teor. Fiz., vol. 19,
pp. 796-806, 1949, has shown that E₀ goes as −0.1088α² for the case of
large α.

² T.D. Schultz, Slow Electrons in Polar Crystals: Self-Energy, Mass, and
Mobility, Phys. Rev., vol. 116, pp. 526-543, 1959.

³ T.-D. Lee, F.E. Low, and D. Pines, The Motion of Slow Electrons in a
Polar Crystal, Phys. Rev., vol. 90, pp. 297-302, 1953.

⁴ Op. cit.

⁵ E.P. Gross, Small Oscillation Theory of the Interaction of a Particle
and Scalar Field, Phys. Rev., vol. 100, pp. 1571-1578, 1955.

⁶ S.I. Pekar, “Untersuchungen über die Elektronentheorie der Kristalle,”
Akademie-Verlag, Berlin, 1954.

⁷ N.N. Bogoliubov, On a New Form of the Adiabatic Theory of Disturbances in
Exchange Effects of a Particle with the Quantum Field, Zh. Eksperim. i Teor. Fiz.,


Other Problems in Probability


12-1 


In the preceding chapters we have seen how to use path integrals to
treat a number of quantum-mechanical problems which are, by their 
very physical nature, probabilistic problems. We have also used the 
path integral method to analyze some aspects of statistical mechanics 
wherein the probabilistic nature of the functions permitted the path 
integral technique to be particularly effective. We can continue this line 
of development into a wide variety of probability problems where this 
approach has special and valuable applications. 

It is the purpose of this chapter to explore a number of these prob- 
ability problems. They will be of two kinds. First we shall discuss 
direct applications of path integral ideas to classical probability prob- 
lems (Sec. 12-1 through 12-6). This is quite different from all preceding 
chapters, in which all applications were to quantum mechanics. Fol- 
lowing that, we shall deal with problems in which both probability and 
quantum mechanics are involved (Sec. 12-7 through 12-10). We cannot, 
in this chapter, deal with these matters in any detail. We shall only out- 
line by some examples how certain problems may be set up and thereby 
suggest to the reader other applications of the path integral approach. 

The main direct application of path integrals to probability problems 
is due to the ability of path integrals to deal directly with the notion 
of the probability of a path or a function. To make this idea clear, we 
proceed in steps from the well-known¹ ideas of probability applied to
discrete events and to continuous variables. 


RANDOM PULSES 


To start with, suppose we consider a typical probability problem for a
discrete variable. We are given a situation in which a series of discrete
events is taking place at random times, e.g., cosmic rays striking a
detector or raindrops falling on a specifically demarked area of ground.
We know that the particles fall at random times, but in any long period
of time T we expect n̄ = μT particles will be observed. In other words,
μ is the mean counting rate.

Of course, in any actual measurement the exact number of particles
n recorded will not, in general, correspond to the expected number. But
we can ask directly: “What is the probability of observing a particular
number n of particles during a period when the expected number of
particles is n̄?” It is given by the Poisson distribution

$$P_n = \frac{\bar{n}^{\,n}}{n!}\, e^{-\bar{n}} \tag{12.1}$$

¹ Harold Cramér, “Mathematical Methods of Statistics,” Princeton
University Press, Princeton, N.J., 1951. We assume knowledge of usual
probability theory.


On the other hand, we might ask a probability question of a different
kind. We might, for example, ask: “What is the probability that the
interval from one particle impact to the next will be some particular
time t?” Actually, there is no correct answer to the question phrased
this way. If we were to ask the probability that the time interval will
be equal to or greater than t, then we could give an answer (it is e^{−μt}).
That is, we can get an answer to a question about t falling within a
certain range. Thus, if we are interested in a particular value, we must
allow ourselves an infinitesimal range and ask the question: “What is
the (infinitesimal) probability that the time interval will fall within the
range dt centered on t?” The answer is written as

$$P(t)\, dt = \mu e^{-\mu t}\, dt \tag{12.2}$$
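Both distributions are easy to check numerically. The sketch below sums the Poisson probabilities of Eq. (12.1) and integrates the interval density of Eq. (12.2) over all times beyond some t₀; the counting rate, observation time, cutoff t₀, and function names are hypothetical values and names chosen for illustration.

```python
import math

def poisson(n, n_bar):
    """Eq. (12.1): probability of n events when n_bar are expected."""
    return n_bar ** n * math.exp(-n_bar) / math.factorial(n)

def interval_density(t, mu):
    """Eq. (12.2): probability per unit time that the next gap is t."""
    return mu * math.exp(-mu * t)

mu, T = 2.0, 3.0              # hypothetical counting rate and period
n_bar = mu * T
probs = [poisson(n, n_bar) for n in range(80)]
total = sum(probs)
mean = sum(n * p for n, p in enumerate(probs))
print(f"sum of P_n = {total:.12f}, mean n = {mean:.6f} (n_bar = {n_bar})")

# Probability that the gap is at least t0: midpoint-rule integral of
# Eq. (12.2) from t0 out to where the density is negligible.
t0, dt, steps = 0.7, 1e-4, 200_000
tail = sum(interval_density(t0 + (i + 0.5) * dt, mu) * dt for i in range(steps))
print(f"P(gap >= {t0}) = {tail:.6f}, exp(-mu t0) = {math.exp(-mu * t0):.6f}")
```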


So we create a concept of a probability distribution of a continuous
variable: P(t) is the probability per unit range of t that the interval is
t. We write the probability distribution of x as P(x) if P(x) dx is the
probability that the variable lies in the range dx about x. We can easily
extend this to two variables and write the probability distribution of x
and y as P(x, y) dx dy. By this we mean that the probability of finding
the variables x and y in the region R of the xy plane is given by

$$\int_R P(x, y)\, dx\, dy$$


We wish to expand the concepts of probability still further. We want 
to consider the distribution not of single variables but of complete curves; 
i.e., we want to construct probability functions, or rather functionals, 
which will permit us to answer the question: “What is the probability 
of obtaining a particular time history of a physical phenomenon, such as 
the voltage in a resistor or the price of a commodity, or, in two variables, 
the probability of a certain shape of the surface of the sea as a function 
of latitude and longitude?” Thus, we are led to consider the probability 
of a function. 

We shall write it down this way. The probability of observing the
function f(t) is a functional P[f(t)]. But we must be careful to remember
that questions relating to such a probability have meaning only if we
define the range within which we are looking for a specific curve. Just
as in the example above we had to ask the question: “What is the
probability of finding the time interval within the range dt?” so now
we must ask: “What is the probability of finding the function within
some more or less restricted class of functions, for example, those curves
which are bounded between values a and b for the complete time history
in which we are interested?” If we call such a subset of curves the class
A, then we ask: “What is the probability of finding f(t) in the class A?”
and we write the answer as the path integral

$$\int_A P[f(t)]\, \mathcal{D}f(t) \tag{12.3}$$


where the integral extends over all functions of class A. 

Actually, this expression can be thought of as similar to the
probability function for a number of different variables. If we imagine time to
be divided into discrete intervals (as we imagined it when we were first
defining path integrals in Chap. 2) taking on the values of t₁, t₂, ..., then
the values of the function at those particular times f(t₁), f(t₂), ... =
f₁, f₂, ... are analogous to the variables of a multivariable distribution
function. The probability of observing a particular curve can then be
thought of as the probability of obtaining a particular set of values
f₁, f₂, ... in the range df₁, df₂, ..., that is, P(f₁, f₂, ...) df₁ df₂ ···.

If we then proceed to the limit as the number of discrete intervals
in time becomes infinite, with suitable normalization, we obtain the
probability of observing the continuous curve f(t) in the range Df(t)
as the integrand in the path integral of Eq. (12.3). It is this probability
concept and this probability functional with which we shall be working
in the remainder of this chapter.
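The discrete picture can be made concrete with a small computation. In the sketch below the joint density is a purely hypothetical choice (independent gaussian values at each time slice), picked only because the answer can then be checked against the error function; the class A is the set of curves that stay between a and b at every slice, as in the example of the text. All names and numbers are assumptions.

```python
import itertools
import math

def joint_density(values, sigma=1.0):
    """Hypothetical P(f1, f2, ...): independent gaussians at each slice."""
    norm = (2.0 * math.pi * sigma ** 2) ** (-0.5 * len(values))
    return norm * math.exp(-sum(f * f for f in values) / (2.0 * sigma ** 2))

def probability_in_band(a, b, n_slices, cells=60):
    """Midpoint-rule integral of the joint density over a <= f_i <= b."""
    df = (b - a) / cells
    mids = [a + (j + 0.5) * df for j in range(cells)]
    volume = df ** n_slices
    return sum(joint_density(values) * volume
               for values in itertools.product(mids, repeat=n_slices))

a, b = -1.0, 1.0
single = 0.5 * (math.erf(b / math.sqrt(2.0)) - math.erf(a / math.sqrt(2.0)))
for n_slices in (1, 2, 3):
    p = probability_in_band(a, b, n_slices)
    print(f"{n_slices} slice(s): P(class A) = {p:.5f}   "
          f"product formula = {single ** n_slices:.5f}")
```

For this particular density the probability of the class shrinks geometrically as slices are added, which is one way to see why a suitable normalization is needed before passing to the limit of a continuous curve.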


CHARACTERISTIC FUNCTIONS 


It is helpful to continue using the analogy between the probability
functional of a path and the more traditional probability function of a
variable. A number of concepts, such as the concept of a mean value, are
common to the two approaches. With usual probability distributions
for quantities which have discrete values, so that the probability of
observing the specific number n is Pₙ, the mean is

$$\sum_{n=1}^{\infty} n P_n = \bar{n} \tag{12.4}$$

For a continuously distributed variable, it is

$$\int_{-\infty}^{\infty} x P(x)\, dx = \bar{x} \tag{12.5}$$

and in an analogous fashion, the mean value of the functional Q[f(t)] is
written

$$\langle Q \rangle = \frac{\int Q[f(t)]\, P[f(t)]\, \mathcal{D}f(t)}{\int P[f(t)]\, \mathcal{D}f(t)} \tag{12.6}$$

In this last equation, as in Sec. 11-1, we have included a path integral 
in the denominator to remind ourselves that we are always faced with 
normalizing problems. In principle, it would be possible to work out 
the path integral of the distribution function, set it equal to 1, and so 
evaluate the normalizing constant to begin with. However, in many 
practical cases it is more convenient to leave the function unnormalized 
and simply cancel out factors on the top and bottom of the expression 
which might, in actuality, be extremely difficult to evaluate. 

Just as the mean value of the function can be expressed in the path
integral notation, so can the mean-square value of the function at a
particular time, say t = a. Thus,

$$\left\langle [f(a)]^2 \right\rangle = \frac{\int [f(a)]^2\, P[f(t)]\, \mathcal{D}f(t)}{\int P[f(t)]\, \mathcal{D}f(t)} \tag{12.7}$$

for this is only a special functional.

One of the most important mean values of a function, as evaluated
with Eq. (12.5), is the mean of e^{ikx}. It is called the characteristic
function, and it is

$$\phi(k) = \left\langle e^{ikx} \right\rangle = \int_{-\infty}^{\infty} e^{ikx} P(x)\, dx \tag{12.8}$$


This is sometimes also called the moment-generating function. It is
simply the Fourier transform of P(x), and it is an extremely useful
function for evaluating various characteristics of the distribution, since it
is equivalent to a knowledge of the distribution function itself. This last
fact is the result of the possibility of performing the inverse transform
as

$$P(x) = \int_{-\infty}^{\infty} e^{-ikx}\, \phi(k)\, \frac{dk}{2\pi} \tag{12.9}$$

A number of important parameters of the distribution can be
determined by taking the derivatives of the characteristic function. Thus, for
example, the mean value of x is

$$\langle x \rangle = -i \left. \frac{d\phi(k)}{dk} \right|_{k=0} \tag{12.10}$$

as is readily demonstrated by differentiating each expression in Eq. (12.8)
with respect to k and then setting k = 0. In fact, a series of such relations
exists:

$$\phi(0) = 1 \qquad \phi'(0) = i\langle x \rangle \qquad \phi''(0) = -\langle x^2 \rangle \tag{12.11}$$
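These relations can be tried out on a specific distribution. The sketch below takes the exponential density of Eq. (12.2) as P(x), computes φ(k) from Eq. (12.8) by a midpoint rule, and recovers the first two moments from finite-difference derivatives at k = 0. The rate, the integration range, the step sizes, and the function names are assumptions made for illustration.

```python
import cmath
import math

def characteristic_function(k, density, lo, hi, n=20000):
    """phi(k) = integral of exp(i k x) P(x) dx, Eq. (12.8), midpoint rule."""
    dx = (hi - lo) / n
    total = 0j
    for j in range(n):
        x = lo + (j + 0.5) * dx
        total += cmath.exp(1j * k * x) * density(x)
    return total * dx

mu = 2.0                                        # hypothetical rate
density = lambda x: mu * math.exp(-mu * x)      # P(x) for x >= 0
phi = lambda k: characteristic_function(k, density, 0.0, 30.0)

h = 1e-3
phi_plus, phi_zero, phi_minus = phi(h), phi(0.0), phi(-h)

# Eq. (12.10): <x> = -i phi'(0), using a central difference.
mean = (-1j * (phi_plus - phi_minus) / (2.0 * h)).real
# Eq. (12.11): phi''(0) = -<x^2>, using a second central difference.
second_moment = (-(phi_plus - 2.0 * phi_zero + phi_minus) / h ** 2).real

print(f"phi(0) = {phi_zero.real:.6f}")
print(f"<x>    = {mean:.6f}   (1/mu     = {1.0 / mu:.6f})")
print(f"<x^2>  = {second_moment:.6f}   (2/mu^2   = {2.0 / mu ** 2:.6f})")
```

For this density the transform can also be done by hand, giving φ(k) = μ/(μ − ik), which provides an independent check on the numerical integral.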


Of course, our next step is to generalize the concept of the
characteristic function to the functional distribution case. We construct a
mathematical definition of such a generalization by returning to our
picture of discrete time intervals. We then wish to perform the Fourier
transform on the probability function of a large number of variables,
using the kernel e^{ik₁f₁} e^{ik₂f₂} ···. As we go to the limit of an infinite
number of time intervals, this becomes simply e^{i∫k(t)f(t)dt}. This, then, is the
functional whose mean value we wish to take in order to develop the
characteristic functional. By using Eq. (12.6), we obtain

$$\Phi[k(t)] = \frac{\int e^{i\int k(t) f(t)\, dt}\, P[f(t)]\, \mathcal{D}f(t)}{\int P[f(t)]\, \mathcal{D}f(t)} \tag{12.12}$$

This characteristic functional also has important special properties. For
example, Φ[0] = 1, and the mean value of the function f(t) evaluated at
the particular time t = a is

$$\langle f(a) \rangle = -i \left. \frac{\delta \Phi[k(t)]}{\delta k(a)} \right|_{k(t)=0} \tag{12.13}$$

where we have used the technique of the functional derivative as de- 
scribed in Sec. 7-2. 

In principle, we can invert our path integral Fourier transform and 
write the probability functional as 


$$
P[f(t)] = \int e^{-i\int k(t) f(t)\,dt}\, \Phi[k(t)]\, \mathcal{D}k(t) \tag{12.14}
$$


where now, of course, the path integral is carried out in the space of the 
k(t) functions. 

We may remark, for use in interpretation later on, that if the func- 
tion f(t) is not uncertain but is definitely known to be some particular 
function F(t), that is, P[f(t)] is zero for all f(t) except f(t) = F(t),
then the characteristic functional is

$$
\Phi[k(t)] = e^{i\int k(t) F(t)\,dt} \tag{12.15}
$$


NOISE


Suppose we apply the ideas so far developed to a particular example and 
in the process develop a few more concepts. Let us consider the situation 
in which we are counting some sort of pulses, perhaps pulses generated 
by the impact of cosmic rays on a Geiger counter or perhaps thermal- 
noise pulses in a resistor. In such cases the pulses are not simply discrete 
spikes of energy but are represented by a rising and falling voltage. Thus,
careful inspection of the actual voltage history associated with such a
pulse would show that it has the form g(t) for a pulse occurring at t = 0.
So, if the pulse occurred at $t_0$, the shape of the voltage curve would be
$g(t - t_0)$.

Now, suppose we conduct our counting experiment for the time in- 
terval of length T (much longer than the length of a single pulse) during 
which a number of pulses centered on the times $t_1, t_2, \ldots, t_n$ occurred.
The complete voltage history over this experiment would be

$$
f(t) = \sum_{j=1}^{n} g(t - t_j)
$$

Since we know when all the events occurred, our probability function 
would simply be the representation of certainty, and by use of Eq. (12.15) 
the corresponding characteristic functional becomes 


$$
\Phi[k(t)] = \exp\left[ i \sum_{j=1}^{n} \int k(t)\, g(t - t_j)\, dt \right] \tag{12.16}
$$


But now suppose that we wish to determine the probability of find- 
ing a particular time history of the voltage before conducting the exper- 
iment. In that case we permit the n events to be randomly distributed 
with uniform probability over the complete time interval. That is, the 
probability of an event happening within the time interval dt is dt/T. 
In this case the characteristic functional becomes 


$$
\begin{aligned}
\Phi[k(t)] &= \int_0^T \!\! \int_0^T \!\cdots\! \int_0^T \exp\left[ i \sum_{j=1}^{n} \int k(t)\, g(t - t_j)\, dt \right] \frac{dt_1}{T}\, \frac{dt_2}{T} \cdots \frac{dt_n}{T} \\
&= \left( \int_0^T e^{i\int k(t+s)\, g(t)\, dt}\, \frac{ds}{T} \right)^{n}
\end{aligned} \tag{12.17}
$$

We call the expression in parentheses A and write this result as $A^n$.

If the number of events in the time interval is distributed in such a
way that the Poisson distribution applies, i.e., the occurrence of each
event is independent of the time of occurrence of any other event and
there is a constant rate μ for the expected number of events per unit
time, then the expected number of events in the time interval T is
$\bar{n} = \mu T$. The characteristic functional is

$$
\Phi[k(t)] = \sum_{n=0}^{\infty} A^n\, \frac{\bar{n}^n}{n!}\, e^{-\bar{n}} \tag{12.18}
$$


The sum on the right-hand side of this equation is the expansion of an 
exponential function, so that we can write the characteristic functional 
as 


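$$
\begin{aligned}
\Phi[k(t)] = e^{-(1-A)\bar{n}} &= \exp\left\{ -\mu T \left( 1 - \int_0^T e^{i\int k(t+s)\, g(t)\, dt}\, \frac{ds}{T} \right) \right\} \\
&= \exp\left\{ -\mu \int_0^T \left( 1 - e^{i\int k(t+s)\, g(t)\, dt} \right) ds \right\}
\end{aligned} \tag{12.19}
$$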
0 


Thus, we may determine the characteristic functional for many different 
situations. We next go on to discuss this result under various approxi- 
mate circumstances. 
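
The step from the Poisson-weighted sum of Eq. (12.18) to the exponential of Eq. (12.19) is easy to confirm numerically. This is a minimal sketch, assuming an arbitrary complex value for the single-pulse average A (any value with |A| ≤ 1 would do) and a hypothetical mean count of 5; the names `poisson_average`, `A_value`, and `nbar` are illustrative only.

```python
import cmath
import math

def poisson_average(A, nbar, nmax=200):
    """Sum A**n weighted by the Poisson probabilities nbar**n exp(-nbar) / n!."""
    total = 0j
    weight = math.exp(-nbar)   # Poisson probability of n = 0
    power = 1 + 0j             # A**0
    for n in range(nmax):
        total += power * weight
        power *= A
        weight *= nbar / (n + 1)
    return total

A_value = 0.9 * cmath.exp(0.3j)   # hypothetical single-pulse average, |A| <= 1
nbar = 5.0                        # hypothetical expected number of pulses
series = poisson_average(A_value, nbar)
closed_form = cmath.exp(-(1 - A_value) * nbar)
print(series, closed_form)
```

The truncation at `nmax` terms is harmless here because the Poisson weights fall off factorially.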

Suppose we imagine that the pulses get very weak while the expected
number of pulses per unit time, that is μ, becomes large. In that case
g(t) is small, so we can expand $e^{i\int k(t+s)\, g(t)\, dt}$ in a power series and we
can approximate the characteristic functional as

$$
\exp\left[ i\mu \iint k(t+s)\, g(t)\, dt\, ds \right] = \exp\left[ i\mu G \int k(t)\, dt \right] \tag{12.20}
$$


where we have defined $G = \int g(t)\, dt$, the area of the pulse. This means
that Φ[k(t)] is in the form of Eq. (12.15) with F(t) = μG (a constant
independent of t). This is equivalent to saying that f(t) is certainly
μG or, in other words, that there is unit probability for observing the
function f(t) = μG and zero probability for observing any other f(t).
That is to say, the pile-up of a large number of small pulses generates a
nearly steady direct voltage of value equal to the number of pulses per
second μ times the average voltage G supplied by each. Next, we go to
one higher approximation and study the fluctuations or irregularities of
this nearly constant voltage.
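
The pile-up of many weak pulses into a nearly steady level can be illustrated by a small simulation. This is only a sketch under stated assumptions: the pulse shape is taken to be rectangular (height `height`, duration `width`, so that G = height × width), the pulse times form a Poisson process of rate μ, and all numerical values are hypothetical.

```python
import random

def shot_noise_samples(mu, height, width, duration, n_samples, seed=0):
    """Sample f(t) = sum_j g(t - t_j) for rectangular pulses at Poisson times."""
    rng = random.Random(seed)
    # Poisson process of rate mu: independent exponential waiting times.
    times = []
    t = rng.expovariate(mu)
    while t < duration:
        times.append(t)
        t += rng.expovariate(mu)
    samples = []
    lo = hi = 0   # events with index in [lo, hi) lie inside the current window
    for i in range(n_samples):
        ts = width + (duration - width) * i / (n_samples - 1)
        while hi < len(times) and times[hi] <= ts:
            hi += 1
        while lo < hi and times[lo] <= ts - width:
            lo += 1
        # A pulse started at t_j contributes `height` if 0 <= ts - t_j < width.
        samples.append(height * (hi - lo))
    return samples

mu, height, width = 200.0, 0.01, 0.5
samples = shot_noise_samples(mu, height, width, duration=200.0, n_samples=2000)
mean_level = sum(samples) / len(samples)
G = height * width
print(mean_level, mu * G)
```

With these values about a hundred pulses overlap at any instant, and the sampled voltage stays close to μG, with residual fluctuations of the order of ten percent.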

Equation (12.20) is a first-order approximation to the exponential
$e^{i\int k(t+s)\, g(t)\, dt}$ in the description of the characteristic functional of
Eq. (12.19). Suppose now that we go on to the next-order approximation
and include the second-order term. This is

$$
-\frac{\mu}{2} \int \left[ \int k(t)\, g(t+s)\, dt \right] \left[ \int k(t')\, g(t'+s)\, dt' \right] ds \tag{12.21}
$$


To simplify this expression, we define a function which measures the 
overlap between two nearby pulses as 


$$
\lambda(\tau) = \int g(t)\, g(t+\tau)\, dt \tag{12.22}
$$


By use of this substitution, the second-order term is reduced to 


$$
-\frac{\mu}{2} \iint k(t)\, k(t')\, \lambda(t - t')\, dt\, dt' \tag{12.23}
$$


Including both first- and second-order terms, the characteristic func- 
tional is 


$$
\Phi[k(t)] = e^{i\mu G \int k(t)\, dt}\; e^{-(\mu/2) \iint k(t)\, k(t')\, \lambda(t - t')\, dt\, dt'} \tag{12.24}
$$


The first factor in this expression is the constant average level, which 
we might call the DC level if we are thinking about voltage pulses. We 
can, if we wish, neglect this level and concentrate only on the variations 
around it by shifting the origin of f(t). That is, we can always take 
out a factor $e^{i\int k(t) F(t)\,dt}$ by shifting the origin of f(t) (i.e., by writing
f(t) = F(t) + f'(t) and studying the probability distribution of f'(t) and
its characteristic functional). If we make this change of origin, we are in 
a position to study the fluctuations of voltage around the DC level. 

We note one special approximation to Eq. (12.24) which is often 
adequate. Generally, λ(τ) is a narrow function of τ. The pulse shape
g(t) rises and falls with a finite width, so if two pulses are spaced a very
great distance apart, their overlapping area vanishes. This is another
way of saying that λ(τ) approaches 0 rapidly as τ becomes large. As a
result of this, if λ(τ) is narrow enough, the second factor in Eq. (12.24)
can be approximated by

$$
e^{-(q/2) \int [k(t)]^2\, dt} \tag{12.25}
$$

where $q = \mu \int_{-\infty}^{\infty} \lambda(\tau)\, d\tau$. This is equivalent to the probability
distribution

$$
P[f(t)] = e^{-(1/2q) \int [f(t)]^2\, dt} \tag{12.26}
$$



Such fluctuations are often called gaussian noise. 
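
The strength q of the fluctuations can be evaluated directly from a pulse shape. The sketch below assumes a hypothetical gaussian-shaped pulse and a hypothetical rate μ = 50, computes the overlap function of Eq. (12.22) by simple quadrature, and integrates it. Since the double integral of g(t) g(t + τ) over t and τ factors into the square of the pulse area, the result should coincide with μG².

```python
import math

def g(t):
    """Hypothetical smooth pulse shape (a unit-width gaussian bump)."""
    return math.exp(-0.5 * t * t)

def integrate(func, a, b, n):
    """Midpoint-rule quadrature of func over [a, b] with n panels."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def overlap(tau):
    """lambda(tau) = integral of g(t) g(t + tau) dt, as in Eq. (12.22)."""
    return integrate(lambda t: g(t) * g(t + tau), -12.0, 12.0, 400)

mu = 50.0                                   # hypothetical pulse rate
G = integrate(g, -12.0, 12.0, 400)          # area of one pulse
q = mu * integrate(overlap, -12.0, 12.0, 200)
print(q, mu * G**2)
```

For a pulse with both positive and negative lobes the area G, and with it q, can be small even when the pulses themselves are not, so the narrow-overlap approximation should then be used with care.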
Characteristics of distribution functionals describing noise functions
have been studied extensively in recent years in the theory of communi- 
cations. A number of characteristics of noise spectra have been defined 
and evaluated, and we shall carry through similar discussions here and 
in the next section, where we treat gaussian noise. 

Now we shall continue to show, by giving one further example, how 
characteristic functionals are set up. We shall consider pulses which 
come at random times all with a given characteristic shape, say u(t), but 
each with a different scale height, so a typical pulse is written au(t). We 
might allow the height a to be either plus or minus. So now we suppose 
the timings of the pulses are randomly spaced instants $t_j$ and the heights
take on random positive and negative values $a_j$. The resulting function
is

$$
f(t) = \sum_{j=1}^{n} a_j\, u(t - t_j) \tag{12.27}
$$


If first we set aside the random nature of the events, we obtain a char- 
acteristic functional equivalent to that of Eq. (12.16) as 


$$
\Phi[k(t)] = \exp\left[ i \sum_{j=1}^{n} a_j \int k(t)\, u(t - t_j)\, dt \right] \tag{12.28}
$$


Next, if we include the presumed random nature of the scale heights of 
the pulses and say that the probability of obtaining for the jth pulse 
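a particular scale height of $a_j$ in the region $da_j$ is $p(a_j)\, da_j$, then the
characteristic functional becomes

$$
\begin{aligned}
\Phi[k(t)] = \int \!\! \int \!\cdots\! \int & \exp\left[ i \sum_{j=1}^{n} a_j \int k(t)\, u(t - t_j)\, dt \right] \\
& \times\, p(a_1)\, da_1\, p(a_2)\, da_2 \cdots p(a_n)\, da_n
\end{aligned} \tag{12.29}
$$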


Of course, each of these probability functions for the values of $a_j$ has
associated with it a characteristic function (also called a moment-generating
function). We call this function

$$
W[\omega] = \int_{-\infty}^{\infty} e^{i\omega a}\, p(a)\, da \tag{12.30}
$$

Then the expression for Φ[k(t)] is

$$
\Phi[k(t)] = \prod_{j=1}^{n} W\!\left[ \int k(t)\, u(t - t_j)\, dt \right] \tag{12.31}
$$

Now we can proceed as we did in the derivation of Eq. (12.17) and
introduce the notion that the exact time at which a pulse occurs is ran- 
domly and uniformly distributed over the interval 0 < t < T. If we 
suppose that there are precisely n pulses in this interval, the character- 
istic functional becomes 


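$$
\Phi[k(t)] = \left( \frac{A}{T} \right)^{n} \tag{12.32}
$$

where

$$
A = \int_0^T W\!\left[ \int k(t)\, u(t - s)\, dt \right] ds \tag{12.33}
$$

If again we assume, as we did in the derivation of Eq. (12.18), that
the pulse distribution satisfies the Poisson distribution, then we must
multiply Eq. (12.32) by $(\bar{n}^n/n!)\, e^{-\bar{n}}$ and sum over n to get

$$
\Phi[k(t)] = e^{-\mu (T - A[k(t)])} = \exp\left\{ -\mu \int \left( 1 - W\!\left[ \int k(t)\, u(t - s)\, dt \right] \right) ds \right\} \tag{12.34}
$$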


As a special example of this result, we assume that the pulse shape
is extremely narrow. In fact, we assume that we can approximate the
shape function by a Dirac delta function, that is, u(t) = δ(t). Then the
characteristic functional is

$$
\Phi[k(t)] = \exp\left\{ -\mu \int \left( 1 - W[k(s)] \right) ds \right\} \tag{12.35}
$$


Next, we assume that the distribution of scale heights is gaussian with 
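zero mean and a root-mean-square value of σ; in other words, the
ordinary normal distribution is given by

$$
p(a)\, da = \frac{1}{\sqrt{2\pi}\, \sigma}\, e^{-a^2/2\sigma^2}\, da \tag{12.36}
$$

In that case, the characteristic function is

$$
W[\omega] = e^{-\sigma^2 \omega^2 / 2} \tag{12.37}
$$

and for Φ there results

$$
\Phi[k(t)] = \exp\left\{ -\mu \int \left( 1 - e^{-(\sigma^2/2) [k(s)]^2} \right) ds \right\} \tag{12.38}
$$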


So we find again that a characteristic functional ®[k(t)] can be de- 
rived to fit our assumed conditions. At any stage in this derivation, ap- 
proximations that would reduce this to a quadratic form may be valid. 
For example, in the case just described a small value of the root-mean-square
scale height σ corresponds to weak signals. If, at the same time,
the expected number of signals arriving in a time interval is not small,
then Eq. (12.38) can be approximated quite well by

$$
\Phi[k(t)] = \exp\left\{ -\frac{\mu \sigma^2}{2} \int [k(t)]^2\, dt \right\} \tag{12.39}
$$


A distribution like this is called white noise. 
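
Two steps of this argument lend themselves to a quick numerical check: the gaussian form of the height characteristic function, Eq. (12.37), and the replacement of the full exponent of Eq. (12.38) by its quadratic approximation when σ is small. The sketch below uses hypothetical values σ = 0.1 and ω = 3 and compares the integrand per unit time for unit rate μ.

```python
import cmath
import math

def W_numeric(omega, sigma, n=4000, span=10.0):
    """Midpoint-rule estimate of W[omega] = integral of exp(i omega a) p(a) da."""
    lo, hi = -span * sigma, span * sigma
    h = (hi - lo) / n
    total = 0j
    for i in range(n):
        a = lo + (i + 0.5) * h
        p = math.exp(-a * a / (2 * sigma**2)) / (math.sqrt(2 * math.pi) * sigma)
        total += cmath.exp(1j * omega * a) * p * h
    return total

sigma, omega = 0.1, 3.0    # hypothetical rms pulse height and argument
w_num = W_numeric(omega, sigma)
w_exact = math.exp(-sigma**2 * omega**2 / 2)

# Exponent per unit time (for mu = 1) in Eq. (12.38) and in its weak-pulse limit.
full = 1 - w_exact
quadratic = sigma**2 * omega**2 / 2
print(w_num, w_exact)
print(full, quadratic)
```

For these values the two exponents differ by about two percent, and the difference shrinks as the fourth power of σ.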


GAUSSIAN NOISE 


The type of distribution whose characteristic functional is gaussian comes 
up in many situations, and we shall discuss it here. 

We have been working with probability distributions which are gaus- 
sian, i.e., exponentials of second order in the defining functions. Al- 
though we arrived at this gaussian functional by making a second-order 
approximation to the exponential term introduced by our assumption 
of a Poisson distribution of random pulses, it is worth remarking that 
a number of physical processes actually seem so distributed by their 
nature. In traditional probability theory the normal, or gaussian, distri- 
bution fits physical phenomena which are the result of the combination 
of a large number of independent events occurring randomly. This is the 
conclusion of the central-limit theorem of probability theory.¹ The same
conclusion applies to distribution functionals and results in the fact that
many important cases for study of physical phenomena have gaussian
distributions. For further reference, we write here the most general form
of a gaussian characteristic functional as

$$
\Phi[k(t)] = e^{i\int k(t) F(t)\, dt}\; e^{-(1/2) \iint k(t)\, k(t')\, A(t, t')\, dt\, dt'} \tag{12.40}
$$

The first factor in this expression can be removed by a shift of
the origin defining f(t), as we discussed in deriving the distribution
of fluctuations of voltage around a DC level. Thus, we could define
f'(t) = f(t) − F(t). Next we note that, if the system we are describing
behaves in a manner independent of the absolute value of time, then the
kernel A(t, t') must have the form A(t − t').

In actual physical situations this function A may be defined by
mechanisms in some sort of experimental situation or by approximating a
particular piece of reality in such a way that it behaves nearly like the
distribution function we are studying. We have an example of such an
approximation in the derivations given above on the noise spectrum.
For it, A(t, t') = μλ(t − t'). In either case theorems of the behavior of
the system which result from the use of this function will be the same
so long as the characteristic functional Φ can be suitably approximated
by the quadratic or gaussian form of Eq. (12.40).

¹ Ibid., pp. 213ff.

Of course, by now we know how to deal with gaussian functionals, 
since we have spent quite a bit of time in the preceding chapters
manipulating them in one way or another. In this particular case the
appearance of the factor i is different from that in typical quantum-mechanical
cases. This means that functions which were real in Sec. 7-4, for ex- 
ample, are imaginary here. However, this does not require any review 
of the mathematical aspects of the subject; it simply is an awareness of 
and preparation for certain differences in detail in the results. 

The probability distribution which corresponds to the characteristic 
functional of Eq. (12.40) is 


$$
P[f(t)] = \exp\left\{ -\frac{1}{2} \iint [f(t) - F(t)]\, [f(t') - F(t')]\, B(t, t')\, dt\, dt' \right\} \tag{12.41}
$$


where the function B(t, t’) is a kernel reciprocal to A(t, t’). That is, the 
functions A and B are related by 


$$
\int A(t, \tau)\, B(\tau, s)\, d\tau = \delta(t - s) \tag{12.42}
$$


Problem 12-1 Prove this. 


All the parameters of the distribution can be calculated from the 
characteristic functional by the methods introduced in Chap. 7. 

We shall now study in more detail some of the physical character- 
istics of gaussian noise that is time-independent; i.e., we shall study 
distributions whose characteristic functional is 


$$
\Phi[k(t)] = e^{-(1/2) \iint k(t)\, k(t')\, A(t - t')\, dt\, dt'} \tag{12.43}
$$


This function A(τ) is called the correlation function. Eq. (12.43) means
that the probability of observing a particular noise function f(t) is

$$
P[f(t)] = e^{-(1/2) \iint f(t)\, f(t')\, B(t - t')\, dt\, dt'} \tag{12.44}
$$

The function B appearing in this last expression is the inverse of the
correlation function A. That is, $\int A(t - s)\, B(s)\, ds = \delta(t)$, or, if

$$
P(\omega) = \int A(\tau)\, e^{i\omega\tau}\, d\tau \tag{12.45}
$$

is the Fourier transform of A(τ), the Fourier transform of B(τ) is 1/P(ω).
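
The reciprocal relation between A and B has a simple discrete analogue, which may help fix the idea. The sketch below assumes a hypothetical correlation function on a periodic grid of N = 16 points (so that convolutions are circular) and builds B from the reciprocal of the discrete transform of A; the grid size and the values of A are illustrative only.

```python
import cmath

N = 16
# Hypothetical correlation function on a periodic grid: A(0) = 2, A(+1) = A(-1) = 0.5.
A = [0.0] * N
A[0], A[1], A[-1] = 2.0, 0.5, 0.5

def dft(seq, sign):
    """Discrete transform sum_n seq[n] exp(sign * 2 pi i k n / N)."""
    n_pts = len(seq)
    return [sum(seq[n] * cmath.exp(sign * 2j * cmath.pi * k * n / n_pts)
                for n in range(n_pts))
            for k in range(n_pts)]

P = dft(A, +1)                                           # analogue of P(omega)
B = [b.real / N for b in dft([1 / p for p in P], -1)]    # transform of B is 1/P

# Circular convolution sum_s A(t - s) B(s): should be a discrete delta function.
conv = [sum(A[(t - s) % N] * B[s] for s in range(N)) for t in range(N)]
print([round(c, 12) for c in conv])
```

The construction works only because the transform of this particular A never vanishes; a spectrum with zeros would have no reciprocal kernel.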

We shall begin by calculating some of the properties of this distribution
functional. We first show that the average value of the noise signal vanishes.
This is because the average value of the noise function at a particular
time t = a is, as in Eq. (12.13),

$$
\langle f(a) \rangle = -i \left. \frac{\delta \Phi}{\delta k(a)} \right|_{k(t)=0} \tag{12.46}
$$

In this expression, the functional derivative of Φ in Eq. (12.43) is given
by (see Sec. 7-2)

$$
\frac{\delta \Phi}{\delta k(a)} = \left[ -\int k(t)\, A(t - a)\, dt \right] \Phi \tag{12.47}
$$


and, if it is evaluated for the particular function k(t) = 0, then it be- 
comes 0. 

Next we calculate the average of the square of the noise function or, 
better, the expected value of the product of two noise functions at times 
a and b. This is called the correlation function of the noise. It is (by 
differentiating both sides of Eq. (12.12) twice) 


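$$
\begin{aligned}
\langle f(a) f(b) \rangle &= -\frac{\delta^2 \Phi}{\delta k(a)\, \delta k(b)} \\
&= A(b - a)\, \Phi - \left[ \int k(t)\, A(t - a)\, dt \right] \left[ \int k(t')\, A(t' - b)\, dt' \right] \Phi
\end{aligned} \tag{12.48}
$$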


and, if this is evaluated for the function k(t) = 0, it is simply A(b — a). 
That is why A is called the correlation function. 
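
A two-variable analogue shows the same mechanism in miniature. In the sketch below the noise is observed at only two times a and b, so the characteristic functional reduces to an ordinary function of two variables; the three correlation values are hypothetical, and the mixed second derivative at k = 0 is estimated by a central finite difference.

```python
import math

# Hypothetical correlation values for two sample times a and b.
A_aa, A_ab, A_bb = 1.5, 0.6, 2.0

def Phi(ka, kb):
    """Gaussian characteristic function for the pair (f(a), f(b)) with zero mean."""
    return math.exp(-0.5 * (A_aa * ka**2 + 2 * A_ab * ka * kb + A_bb * kb**2))

def correlation_from_phi(h=1e-3):
    """Estimate -d^2 Phi / dka dkb at k = 0 by a central finite difference."""
    mixed = (Phi(h, h) - Phi(h, -h) - Phi(-h, h) + Phi(-h, -h)) / (4 * h * h)
    return -mixed

corr = correlation_from_phi()
print(corr, A_ab)
```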


NOISE SPECTRUM 


A most useful characteristic of the noise distribution is the power spec- 
trum of the noise (see Prob. 6-26), which is defined as the mean value 
of the square of the Fourier transform of the noise function, that is, the 
mean square of 


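$$
\phi(\omega) = \int f(t)\, e^{i\omega t}\, dt \tag{12.49}
$$

By using our previous results, we can evaluate this as

$$
\begin{aligned}
\langle |\phi(\omega)|^2 \rangle &= \left\langle \int f(a)\, e^{i\omega a}\, da \int f(b)\, e^{-i\omega b}\, db \right\rangle \\
&= \iint \langle f(a) f(b) \rangle\, e^{i\omega(a - b)}\, da\, db \\
&= \iint A(b - a)\, e^{i\omega(a - b)}\, da\, db \\
&= \int P(\omega)\, da
\end{aligned} \tag{12.50}
$$

Here we have made use of the function P(ω), the Fourier transform of
the correlation function A (see Eq. 12.45).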

If we carried out the integration shown in the last step of Eq. (12.50) 
we would, of course, get an infinite result. Therefore, the mean-square 
value which we are attempting to work out can be defined only for some 
finite time interval. If we take a unit time interval, then we say that the 
mean power per second is 


$$
\text{Mean of } |\phi(\omega)|^2 \text{ per second} = P(\omega) \tag{12.51}
$$


We can apply some of these general results to our special example of 
noise produced by a multitude of small pulses. The correlation function 
for our problem is μλ(τ), with λ(τ) introduced in Eq. (12.22). That is,

$$
A(\tau) = \mu \int g(t)\, g(t + \tau)\, dt \tag{12.52}
$$

This means that the power spectrum is

$$
P(\omega) = \mu \iint g(t)\, g(t + \tau)\, e^{i\omega\tau}\, d\tau\, dt = \mu |\gamma(\omega)|^2 \tag{12.53}
$$

where γ(ω) is the Fourier transform of our pulse function g(t). We can
explain this simple result more directly for our problem as follows. If
the pulses occur at times $t_i$ so that $f(t) = \sum_i g(t - t_i)$, the Fourier
transform of f(t) is $\phi(\omega) = \sum_i \gamma(\omega)\, e^{i\omega t_i}$. Thus the square of φ(ω) has
the average

$$
\langle |\phi(\omega)|^2 \rangle = |\gamma(\omega)|^2 \left\langle \sum_i \sum_j e^{i\omega(t_i - t_j)} \right\rangle \tag{12.54}
$$

But, since the times $t_i$ are random, and independent of $t_j$ for i ≠ j, all
the terms with i ≠ j average out, because the average of $e^{i\omega(t_i - t_j)}$ is
zero. Only the terms with i = j remain. Each is $|\gamma(\omega)|^2$, and they are
μT in number; so the mean of $|\phi(\omega)|^2$ per second is $\mu |\gamma(\omega)|^2$.
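
The cancellation of the cross terms can be seen in a small Monte Carlo experiment. The sketch below makes several simplifying assumptions that are not in the text: the number of pulses in each trial is fixed at μT rather than Poisson distributed, the frequency is chosen so that ωT is a multiple of 2π (which makes the average of each phase factor vanish exactly), and the pulse transform at that frequency is given a hypothetical constant value `gamma`.

```python
import cmath
import math
import random

def mean_power_per_second(mu, T, omega, gamma, trials=2000, seed=1):
    """Monte Carlo estimate of <|phi(omega)|^2> / T for pulses at random times."""
    rng = random.Random(seed)
    n = round(mu * T)   # fixed pulse count per trial (a simplifying assumption)
    total = 0.0
    for _ in range(trials):
        s = sum(cmath.exp(1j * omega * rng.uniform(0.0, T)) for _ in range(n))
        total += abs(gamma * s) ** 2
    return total / trials / T

mu, T = 5.0, 20.0
omega = 2 * math.pi        # omega * T is a multiple of 2 pi
gamma = 0.5                # hypothetical value of the pulse transform at omega
estimate = mean_power_per_second(mu, T, omega, gamma)
print(estimate, mu * abs(gamma) ** 2)
```

The estimate carries a statistical error of a few percent with this number of trials, so agreement with μ|γ(ω)|² should be expected only to that accuracy.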

In the special case that the characteristic function can be approx- 
imated by the white-noise characteristic of Eq. (12.39), the function 
A(t − t') = const δ(t − t'). This means that P(ω) is independent of ω
and there is the same "power" per unit frequency range (mean $|\phi(\omega)|^2$
per second) at all frequencies.

The distributions we are describing can very conveniently be de- 
scribed by giving the probability distribution not for f(t) but for its 
Fourier transform φ(ω) directly, and the characteristic functional not in
terms of k(t) but its Fourier transform

$$
K(\omega) = \int k(t)\, e^{i\omega t}\, dt \tag{12.55}
$$

Using these functions, the characteristic functional for the noise
distribution corresponding to Eq. (12.43) is

$$
\Phi = e^{-(1/2) \int |K(\omega)|^2\, P(\omega)\, d\omega/2\pi} \tag{12.56}
$$

by direct substitution of the inverse of Eq. (12.55) into Eq. (12.43). The
corresponding probability functional is

$$
P = e^{-(1/2) \int [|\phi(\omega)|^2 / P(\omega)]\, d\omega/2\pi} \tag{12.57}
$$

We deduce Eq. (12.57) from Eq. (12.56) as follows. Note that

$$
\int k(t)\, f(t)\, dt = \int K^*(\omega)\, \phi(\omega)\, d\omega/2\pi \tag{12.58}
$$

so that Eq. (12.14) implies

$$
P = \int \Phi\; e^{-i \int K^*(\omega)\, \phi(\omega)\, d\omega/2\pi}\, \mathcal{D}K(\omega) \tag{12.59}
$$


If we now imagine the possible values of ω to be discrete and separated
by an infinitesimal spacing 2πΔ, the integrals in the exponent in
Eqs. (12.56) and (12.57) can be replaced by Riemann sums, and our
path integral becomes

$$
P = \prod_{\omega} \int e^{-(1/2) |K(\omega)|^2 P(\omega) \Delta}\; e^{-i K^*(\omega) \phi(\omega) \Delta}\, dK(\omega) \tag{12.60}
$$

The integral for each value of ω can be done separately (by completing
the square), and we get

$$
P = \prod_{\omega} e^{-(1/2) [|\phi(\omega)|^2 / P(\omega)] \Delta} \tag{12.61}
$$

Putting the product together gives Eq. (12.57). It is clear that what
happens at one frequency is independent of what happens at another,
and that the signal strength φ(ω) at frequency ω is distributed as a
gaussian with a mean square proportional to P(ω).
BROWNIAN MOTION


It is usually true that the path integral method does not really help 
to get the solution to problems that cannot be solved in some other 
manner. Nevertheless, someone who has followed us this far and who 
is now familiar with path integrals will find its mode of expression and 
logic very simple and direct when applied to probability problems. 

For example, in the theory of brownian motion we might have a 
linear system — say, a damped harmonic oscillator being driven by a 
fluctuating force f(t). Assume the mass of the oscillator equal to 1, and 
we must solve

$$
\ddot{x}(t) + \gamma \dot{x}(t) + \omega_0^2\, x(t) = f(t) \tag{12.62}
$$

where x(t) is the coordinate of the oscillator. If the function f(t) is
not known but is given by a known probability distribution $P_f[f(t)]$,
what is the probability distribution $P_x[x(t)]$ for the various responses
x(t)? Equation (12.62) relates x(t) to f(t); that is, for each f(t) there
is an x(t). Hence the probability of given x's is the same as that for the
corresponding f's, or

$$
P_x[x(t)]\, \mathcal{D}x(t) = P_f[f(t)]\, \mathcal{D}f(t) \tag{12.63}
$$

where x(t) is related to f(t) via Eq. (12.62). In general, we must be very
careful in relating path differentials like Dx(t) to Df(t), there being an
analogue of a "jacobian" between the "volume" elements. But if f(t)
and x(t) are linearly related (as above), this jacobian is a constant; so
if, as is usual with path integrals, we can trust ourselves to normalize
our answer in the end, we have

$$
P_x[x(t)] = \text{const}\; P_f[\ddot{x}(t) + \gamma \dot{x}(t) + \omega_0^2\, x(t)] \tag{12.64}
$$

which gives us a formal solution. If $P_f$ is gaussian, then so is $P_x$, and the
problem may be worked out in many ways, the most evident being by
the method of Fourier series if $\omega_0^2$ and γ are independent of time.

At any rate, many problems can be set up and solved or partly solved 
by using Eq. (12.64) as a starting point. We shall look at a specific 
example. A fast particle goes through matter in which it receives small, 
sharp alterations in velocity as a result of passage by nuclei. After going 
through a thickness T, what is the probability it will emerge a distance D
from the origin (the extension of its original straight-line path) moving
with deflection angle θ as in Fig. 12-1?

Fig. 12-1 A fast particle impinges perpendicularly on a slab of matter of thickness T.
After traveling through a thickness t measured parallel to its original line of flight, it
is deflected away from its original trajectory (as extended) by a distance x owing to a
number of interactions with the nuclei in the material. Eventually, it emerges from the
slab a distance D from the point O, at which it would have emerged with no deflection,
and is traveling in a direction that makes an angle θ with its original direction.


We assume that the interactions cause no measurable loss in the 
longitudinal velocity of the particle and that the matter through which 
the particle passes is homogeneous. Further, we assume that θ is always
small and that the motion is the result of a large number of collisions
each of which has a small effect. We assume that the expected number of
collisions in the infinitesimal thickness dt is μ dt and that the deflection
suffered in each collision is given by the angle Δ, which is governed
by the probability distribution p(Δ) dΔ. We further assume that this
probability distribution results in a mean-square value of Δ given by

$$
\int_{-\infty}^{\infty} \Delta^2\, p(\Delta)\, d\Delta = \sigma^2 \tag{12.65}
$$

and we shall use the definition R = μσ².

We shall confine our attention to the motion as projected onto a two- 
dimensional plane containing the original path of the particle. Motion 
in a plane normal to this will follow similar rules, and the motion in 
either plane can be considered independently of the other. We shall use 
t to measure the depth of penetration into the slab, θ to represent the
instantaneous direction of motion in the plane we are considering, and
x to measure the position of the particle away from an extension of its
original path of motion, as shown in Fig. 12-1. These parameters are
related by dx = θ dt, or $\dot{x} = \theta$.

We assume that the deflections of θ occur suddenly, so that $\dot{\theta} = f(t)$,
where the functions f(t) are a set of randomly spaced delta functions
having random scale heights. This means that $\ddot{x}(t) = f(t)$ and $P_f[f(t)]$
has the characteristic functional (see Eq. 12.34)

$$
\Phi[k(t)] = e^{-\mu \int (1 - W[k(t)])\, dt} \tag{12.66}
$$

where

$$
W[\omega] = \int p(\Delta)\, e^{i\omega\Delta}\, d\Delta \tag{12.67}
$$


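We note that the mean value of Δ is assumed to be 0, and these
deflections themselves are assumed small. Now if we expand W[ω] as

$$
W[\omega] = \int p(\Delta) \left( 1 + i\omega\Delta - \frac{\omega^2 \Delta^2}{2} + \cdots \right) d\Delta \tag{12.68}
$$

and use terms only through second order in Δ to get $W[\omega] = 1 - \omega^2 \sigma^2/2$,
then

$$
\Phi[k(t)] = e^{-(1/2) R \int [k(t)]^2\, dt} \tag{12.69}
$$

This in turn implies (Eq. 12.44) that

$$
P_f[f(t)] = e^{-(1/2R) \int [f(t)]^2\, dt} \tag{12.70}
$$

Hence

$$
P_x[x(t)] = \text{const}\, \exp\left\{ -\frac{1}{2R} \int_0^T [\ddot{x}(t)]^2\, dt \right\} \tag{12.71}
$$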


We wish to evaluate the probability distribution P(D, θ), which gives
the probability that the particle will exit with displacement D and angle
of motion θ when it enters with initial conditions x(0) = 0 and $\dot{x}(0) = 0$.
We are concerned not with the exact path that the particle takes in the
material, but only that the particle exits with x(T) = D and $\dot{x}(T) = \theta$.
Thus, we express this probability distribution by an integral over all
paths as

$$
P(D, \theta) = \int \exp\left\{ -\frac{1}{2R} \int_0^T [\ddot{x}(t)]^2\, dt \right\} \mathcal{D}x(t) \tag{12.72}
$$

where the paths included in the integral satisfy the assumed end-point
conditions. This integral can be carried out by the methods of Sec. 3-5.
The integral is a gaussian and becomes an extremum for the path

$$
\frac{d^4 x}{dt^4} = 0 \tag{12.73}
$$

The solution of this equation, which satisfies our assumed boundary
conditions, is

$$
x(t) = (3D - \theta T) \left( \frac{t}{T} \right)^2 - (2D - \theta T) \left( \frac{t}{T} \right)^3 \tag{12.74}
$$


By using this path in the integrand of the exponential function in Eq. 
(12.72), we find 





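$$
\frac{1}{2R} \int_0^T [\ddot{x}(t)]^2\, dt = \frac{6}{R T^3} \left( D - \frac{\theta T}{2} \right)^2 + \frac{\theta^2}{2RT} \tag{12.75}
$$

which means that our required probability distribution is

$$
P(D, \theta) = \text{const}\, \exp\left\{ -\frac{6}{R T^3} \left( D - \frac{\theta T}{2} \right)^2 - \frac{\theta^2}{2RT} \right\} \tag{12.76}
$$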
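
The algebra leading from the extremal path to the exponent of the distribution can be verified numerically. The following is a minimal sketch that integrates the square of the second derivative of the cubic path by the midpoint rule and compares it with the closed form; the values chosen for D, θ, T, and R are hypothetical.

```python
def x_path(t, D, theta, T):
    """Extremal path: a cubic with x(0) = 0, x'(0) = 0, x(T) = D, x'(T) = theta."""
    s = t / T
    return (3 * D - theta * T) * s**2 - (2 * D - theta * T) * s**3

def x_ddot(t, D, theta, T):
    """Second derivative of the extremal path with respect to t."""
    return 2 * (3 * D - theta * T) / T**2 - 6 * (2 * D - theta * T) * t / T**3

def exponent_numeric(D, theta, T, R, n=1000):
    """(1/2R) times the integral of x_ddot squared over [0, T], midpoint rule."""
    h = T / n
    return sum(x_ddot((i + 0.5) * h, D, theta, T) ** 2 for i in range(n)) * h / (2 * R)

def exponent_closed(D, theta, T, R):
    """Closed form: (6 / R T^3)(D - theta T / 2)^2 + theta^2 / (2 R T)."""
    return 6 / (R * T**3) * (D - theta * T / 2) ** 2 + theta**2 / (2 * R * T)

D, theta, T, R = 0.3, 0.2, 2.0, 0.05   # hypothetical exit data and slab parameters
print(exponent_numeric(D, theta, T, R), exponent_closed(D, theta, T, R))
```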


In some practical cases we may really be concerned not with the 
exact linear spacing of the particle away from our assumed origin point 
but, rather, with the deflection angle at which it leaves the slab. Given 
the overall distribution function of Eq. (12.76), it is simple to evaluate 
the distribution function in angle alone by integrating over all values 
of D. The result is $e^{-\theta^2/2RT}$. This is an expected result, because we
have already assumed that the mean-square value of the deflection angle
which would be acquired in a unit thickness is R, so this value in a total
thickness T should be RT.

Suppose next we look only at particles which emerge traveling in a
specific angle θ and consider the distribution function of the emerging
positions D of those particles. We find that the probability distribution
has a maximum at D = θT/2. This would be the position we would
expect if the final deflection angle θ were acquired in a smooth manner
as a linear function of thickness starting from 0 and building up to its
final value. In that case its average value during the passage through
the slab would be θ/2.
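
Both statements, the mean-square exit angle RT and the most probable displacement θT/2 at given θ, can be illustrated by a direct simulation of the scattering. The sketch below is not the method of the text: it assumes that the many small collisions in each thin slice of thickness dt can be lumped into a single gaussian change of angle with variance R dt, and it uses hypothetical values of R and T. The slope of a least-squares line through the origin relating D to θ estimates the ratio D/θ at the peak.

```python
import random

def simulate_exits(R, T, steps=20, trials=10000, seed=2):
    """Return (D, theta) at exit for many particles crossing a slab of thickness T."""
    rng = random.Random(seed)
    dt = T / steps
    kick = (R * dt) ** 0.5      # rms change of angle accumulated in one slice
    exits = []
    for _ in range(trials):
        x = theta = 0.0
        for _ in range(steps):
            new_theta = theta + rng.gauss(0.0, kick)
            x += 0.5 * (theta + new_theta) * dt   # dx = theta dt, trapezoidal step
            theta = new_theta
        exits.append((x, theta))
    return exits

R, T = 0.01, 1.0                # hypothetical scattering strength and thickness
exits = simulate_exits(R, T)
var_theta = sum(th * th for _, th in exits) / len(exits)
slope = sum(d * th for d, th in exits) / sum(th * th for _, th in exits)
print(var_theta, R * T)
print(slope, T / 2)
```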


Problem 12-2 Show that the constant required to normalize the 
probability function P(D, θ) dD dθ is

$$
\sqrt{\frac{6}{\pi R T^3}}\; \frac{1}{\sqrt{2\pi R T}} \tag{12.77}
$$


QUANTUM MECHANICS


In this and the following sections we should like to see how to formu- 
late statistical problems in quantum mechanics. In quantum mechanics 
there are probabilities involved in an intrinsic way, because even a known 
state implies probabilities to be found in other states. But in addition 
there may be extrinsic uncertainties. The state, for example, may not be
known; we may know only that the state is such and such with a certain
probability. This situation is analogous to the classical-mechanics
situation in which the initial conditions are not known and only a
probability distribution for such conditions is available. We have already dealt
with such a situation in statistical mechanics (see Chap. 10), but that
is a very special case in which the state of energy E has the probability
$e^{-E/kT}$. Here we shall be more general.

Again, under a given external force, say f(t), the behavior of a
quantum-mechanical system can be worked out, but what can we say if
that force is uncertain and has a probability distribution P[f(t)] Df(t)?
Need we actually solve the problem for each f(t) and then average, or 
is there some way to formulate the problem after the average of f(t) 
is taken? (We hope so, because it often occurs that the solution of a 
statistical problem after an average is taken is, in fact, much easier than 
finding the general solution of the original problem for a wide range of 
conditions.) We shall find such a formulation in this section. Then we 
shall go on to discuss situations in which a quantum-mechanical system 
is disturbed not just by a classical system but by another quantum- 
mechanical system about which there are statistical uncertainties. 

Our main purpose in this chapter is to show how these and other 
questions may be formulated. We shall not deal in detail with solving 
the special problems mentioned; they are brought up only to help us 
understand the more general formulations we shall arrive at. 

We wish first to discuss the analogue of brownian motion for a 
quantum-mechanical system. That is, we shall suppose that a quantum- 
mechanical system whose unperturbed action is S[x(t)] is under the
influence of an external force f(t) such that its action is¹

$$
S_f[x(t)] = S[x(t)] + \int x(t)\, f(t)\, dt \tag{12.78}
$$

¹ We shall do everything as though there were only one coordinate x. One can
immediately generalize to several coordinates $x_i$ (so that a set of forces $f_i$ act) and
to cases in which the coefficient in front of f(t) in the action is not simply x but
some more complex operator.



Suppose we ask: "What is the probability that, starting at some
time $t_a$ with coordinate $x(t_a) = x_a$, we shall arrive at a final time $t_b$ at
coordinate $x_b$?" It is the square of an amplitude: $|K(x_b, t_b; x_a, t_a)|^2$. Or
again, if we specify that initially a system is in the state of wave function
ψ(x) and finally in the state of wave function χ(x), the probability of
transition from ψ to χ is

$$
\begin{aligned}
P[\chi(x); \psi(x)] &= \left| \iint \chi^*(x_b)\, K(x_b, t_b; x_a, t_a)\, \psi(x_a)\, dx_a\, dx_b \right|^2 \\
&= \iiiint \chi^*(x_b)\, \chi(x'_b)\, K(x_b, t_b; x_a, t_a)\, K^*(x'_b, t_b; x'_a, t_a)\, \psi(x_a)\, \psi^*(x'_a) \\
&\qquad\qquad \times\, dx_a\, dx_b\, dx'_a\, dx'_b
\end{aligned} \tag{12.79}
$$

It is evident that all such problems can be solved if we can evaluate

$$
K(x_b, t_b; x_a, t_a)\, K^*(x'_b, t_b; x'_a, t_a) \tag{12.80}
$$

The first factor is the path integral $\int e^{iS[x(t)]}\, \mathcal{D}x(t)$, whereas the second
factor is its complex conjugate² $\int e^{-iS[x(t)]}\, \mathcal{D}x(t)$. Each integral is over
paths with appropriate end points. In writing the product of Eq. (12.80),
we shall call the path variable in the second integral x'(t) and we can
then express Eq. (12.80) as the double path integral

$$
\iint e^{i\{S[x(t)] - S[x'(t)]\}}\, \mathcal{D}x(t)\, \mathcal{D}x'(t) \tag{12.81}
$$

The summing of such integrals over various end points gives the required
probability.

If the force f(t) is acting, we should replace S[x(t)] in Eq. (12.81) by
$S_f[x(t)]$, and the expression becomes

$$
\iint \exp\left( i\{S[x(t)] - S[x'(t)]\} + i \int [x(t) - x'(t)]\, f(t)\, dt \right) \mathcal{D}x(t)\, \mathcal{D}x'(t) \tag{12.82}
$$

But now suppose the force is known only in a probabilistic sense; i.e.,
we know that there is a probability $P_f[f(t)]\, \mathcal{D}f(t)$ that the force is f(t).
Then the probability to go from ψ to χ is given by Eq. (12.79) calculated
for each f(t) and then averaged over all f(t) each with the weight
$P_f[f(t)]\, \mathcal{D}f(t)$. This is then

$$
P[\chi(x); \psi(x)] = \iiiint \chi^*(x_b)\, \chi(x'_b)\, J(x_b, x'_b; x_a, x'_a)\, \psi(x_a)\, \psi^*(x'_a)\, dx_a\, dx_b\, dx'_a\, dx'_b \tag{12.83}
$$

where J is the average of Eq. (12.82) over all f(t) with weight
$P_f[f(t)]\, \mathcal{D}f(t)$; thus

$$
J(x_b, x'_b; x_a, x'_a) = \iiint e^{i\{S[x(t)] - S[x'(t)]\}}\, e^{i \int [x(t) - x'(t)]\, f(t)\, dt}\, P_f[f(t)]\, \mathcal{D}x(t)\, \mathcal{D}x'(t)\, \mathcal{D}f(t) \tag{12.84}
$$

with the integrals taken between appropriate end points $x(t_a) = x_a$,
$x'(t_a) = x'_a$, $x(t_b) = x_b$, $x'(t_b) = x'_b$. Actually, this choosing of end points
and then integrating over various values with wave-function distributions
depending on the problem (as in Eq. 12.83) is simply a sum of J's for
different end conditions, and we shall hereafter simply forget this and
speak as though with J we already have our probability, it being left
to the reader to remember that a bit more has yet to be done. This is
so that we can concentrate on the main feature, the evaluation of the
double path integral needed to calculate J.

² We suppose that S[x(t)] is real and that our units are so defined that ħ = 1, as
in Chap. 11.

In this form we can do the integral over f(t) explicitly and see that,
to find the probabilities after averaging, we must evaluate a double path
integral

$$J = \int\!\!\int e^{i\{S[x(t)] - S[x'(t)]\}}\,\Phi[x(t) - x'(t)]\,\mathcal{D}x(t)\,\mathcal{D}x'(t) \tag{12.85}$$

where Φ[k(t)] is the characteristic functional belonging to the probability
distribution P_f, so

$$\Phi[k(t)] = \int e^{i\int k(t) f(t)\,dt}\, P_f[f(t)]\,\mathcal{D}f(t) \tag{12.86}$$


Equation (12.85) then answers our challenge to express the answer 
in a form valid after the averaging. It involves evaluation of the double 
path integral. How to evaluate it is, of course, another question, but the 
methods discussed in this book may be useful. In these sections we are 
discussing only how various problems may be formulated. 

As an example of the application of Eq. (12.85), suppose f(t) is
gaussian noise with zero mean and correlation function A(t,t') as in
Eq. (12.40). We must evaluate

$$J = \int\!\!\int e^{i\{S[x(t)] - S[x'(t)]\}} \exp\Big\{-\tfrac{1}{2}\int\!\!\int [x(t) - x'(t)]\,[x(t') - x'(t')]\,A(t,t')\,dt\,dt'\Big\}\,\mathcal{D}x(t)\,\mathcal{D}x'(t) \tag{12.87}$$

Because in the new factor at least the x and x' appear only quadrati-
cally, some of the methods previously discussed for quadratic forms may
be useful. Of course, if S[x] is itself quadratic, corresponding to a har-
monic oscillator, the path integrals can be evaluated exactly by using
the methods of Sec. 3-5.
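
The gaussian factor in Eq. (12.87) is the characteristic functional of
Eq. (12.86) evaluated for gaussian noise. A minimal numerical sketch of
the single-variable analogue is given below: one noise value f with
variance A replaces the function f(t), and the claim checked is that the
average of exp(ikf) equals exp(-A k^2/2). The function names, the
integration window, and the step count are illustrative assumptions and
do not come from the text.

```python
import cmath
import math


def gaussian_characteristic(k, variance, half_width=12.0, steps=20000):
    """Midpoint-rule average of exp(i*k*f) over a zero-mean gaussian in f."""
    sigma = math.sqrt(variance)
    lo = -half_width * sigma
    df = 2.0 * half_width * sigma / steps
    norm = 1.0 / math.sqrt(2.0 * math.pi * variance)
    total = 0j
    for j in range(steps):
        f = lo + (j + 0.5) * df  # midpoint of the j-th cell
        weight = norm * math.exp(-f * f / (2.0 * variance))
        total += cmath.exp(1j * k * f) * weight * df
    return total


def closed_form(k, variance):
    """Single-variable analogue of the gaussian factor in Eq. (12.87)."""
    return math.exp(-0.5 * variance * k * k)
```

With k playing the role of x(t) - x'(t) at one instant, the two functions
should agree to within the quadrature error.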


12-8 INFLUENCE FUNCTIONALS


Now we wish to discuss the behavior of a quantum-mechanical system 
whose general coordinate we call x in interaction with another quantum-
mechanical system whose coordinate we call X.¹ We shall suppose that
all measurements which are to be made are on system x only, and no 
direct measurements of the system X will be made. For example, we 
may be interested in how an atom makes transitions because it is in 
the electromagnetic field and can radiate. We contemplate studying 
only the atom and will not directly measure the light coming from it; 
then x are the atomic coordinates and X the coordinates of the field. 
If we study it the other way — that is, if we only observe the light 
from the atom, emitted, absorbed, or scattered, but never ask for any 
quantity directly involving the atom’s variables — then we may use our 
present analysis with x being the coordinates of the electromagnetic field 
and X those of the atom. If, for example, the theory of the index of 
refraction is wanted, then x are again the field coordinates and X the 
coordinates of the matter through which the light goes. For one further 
example, suppose the behavior of an electron in a crystal (or an ion 
in a liquid) is to be studied: the measurements to be analyzed involve 
directly only the position of the charge, not the material of the crystal. 
For example, we might wish the current (electron velocity) generated 
in some circumstance, but we are not contemplating correlations with 
the number of phonons produced. Then x can be the coordinates of the
electron and X all the other coordinates of the matter of the crystal. 

Let S[x(t)] be the action of system x, S_0[X(t)] that of the environ-
mental system alone, and S_int[x(t), X(t)] that of the interaction between
the environmental system X and the system of interest x. The action
of the combined system is S[x(t)] + S_0[X(t)] + S_int[x(t), X(t)], and the
probability of any event involving the combined system can be evaluated
from the double path integral, an obvious generalization of Eq. (12.81),
and now written as

$$\begin{aligned} J = \int\!\!\int\!\!\int\!\!\int \exp\big(i\{ & S[x(t)] - S[x'(t)] + S_0[X(t)] - S_0[X'(t)] \\ & + S_{\text{int}}[x(t),X(t)] - S_{\text{int}}[x'(t),X'(t)]\}\big)\,\mathcal{D}x(t)\,\mathcal{D}X(t)\,\mathcal{D}x'(t)\,\mathcal{D}X'(t) \end{aligned} \tag{12.88}$$

¹X stands for any number of coordinates; this other system may be, and gener-
ally is, very complex. We shall just carry one X variable, but nothing essential will
be lost.



But, if we need no measurements on system X and if only the depen-
dence on x(t) need ever be studied, then we can write our answer in the
form

$$J = \int\!\!\int e^{i\{S[x(t)] - S[x'(t)]\}}\,F[x(t),x'(t)]\,\mathcal{D}x(t)\,\mathcal{D}x'(t) \tag{12.89}$$

where we shall call the functional F[x(t), x'(t)] the influence functional.
It is a functional of the two functions x(t) and x'(t), and for this partic-
ular problem it is given by

$$\begin{aligned} F[x(t),x'(t)] = \sum_{\text{final}} \int\!\!\int \exp\big(i\{ & S_0[X(t)] - S_0[X'(t)] \\ & + S_{\text{int}}[x(t),X(t)] - S_{\text{int}}[x'(t),X'(t)]\}\big)\,\mathcal{D}X(t)\,\mathcal{D}X'(t) \end{aligned} \tag{12.90}$$

The sum ranges over all possible final states of X. This is because no
measurement on X is to be taken, and all final states of the environment
are possible. Therefore we must add together the probabilities (i.e., the
J functions of Eq. (12.88)) of all of them. In coordinate representation, for
example, the sum over final states just means that at the final time t_f,
after we are no longer interested in the interaction, we must take
X(t_f) = X'(t_f) = X_f and integrate over all X_f.

To summarize, the behavior of a system in any environment can be 
discussed in terms of a double path integral like Eq. (12.89), where F 
is a property of the environment — its “influence” on the system. It 
summarizes all of the environment that is relevant to x(t). Two different 
possible surrounding conditions, say, A and B, might physically be very 
differently constructed; nevertheless, if they happen to lead to the same 
functional F, they are indistinguishable as far as the behavior of the x
system is concerned.

This F is somewhat analogous to the use of “external force” in sepa- 
rating the behavior of interacting systems classically. We can analyze the 
motion of x alone provided we know what force is produced (as a func- 
tion of time) by the environment. These newtonian equations of motion 
of x alone are the rough analogue of Eq. (12.89), whereas Eq. (12.90) 
corresponds to the calculation of the force produced by a given environ- 
ment. Two different environments which produce the same force on x 
are equivalent. Actually, the analogy is only rough; for F contains the 
entire effect of the environment including the change in behavior of the
environment resulting from reaction with x. In the classical analogue,
F would correspond to knowing not only what the force is as a function
of time, but also what it would be for every possible motion x(t). The
force for a given environmental system depends in general on the motion
x(t), of course, since the environmental system is affected by interaction
with the system of interest x.

We are therefore led to study the properties of influence functionals. 
We shall be content to list a few such rules and give some suggestions 
on how they are arrived at. 

Rule I: 


$$F[x(t),x'(t)] = F^*[x'(t),x(t)] \tag{12.91}$$


where the asterisk means complex conjugate. 

Rule II: If the argument functions x(t) and x'(t) are equal for t
exceeding some value t_c, then F does not depend upon the actual values
of x(t) for t > t_c.

Rule III: If F_i is the influence functional for a particular environment
i and we do not know what the environment actually is but know only
that the probability of its being i is w_i, then the effective influence
functional (for calculating all probabilities) is

$$F = \sum_i w_i F_i \tag{12.92}$$


Rule IV: If the system x is simultaneously in interaction with two 
external systems A and B, and if A and B do not interact directly with 
each other, and if there is no correlation between their initial conditions, 
then 


$$F = F_A \cdot F_B \tag{12.93}$$

where F_A is the influence functional if A alone were interacting and F_B
is that if B alone were interacting.

Rule V: If the functional F can be adequately approximated by the 
form 


$$F[x(t),x'(t)] = \exp\Big\{i\int [x(t) - x'(t)]\, f(t)\,dt\Big\} \tag{12.94}$$
then the system x is acting as though under a classical force f(t) with
action of interaction $\int x(t) f(t)\,dt$. If it is of the form

$$F[x(t),x'(t)] = \Phi[x(t) - x'(t)]$$

where Φ[k(t)] is any functional, then the environment is equivalent to a
classical but uncertain force f(t), where Φ is the characteristic functional
for the distribution of f(t).
That rule I is true is evident directly from Eq. (12.90).

This expression also explains rule II, but in a much more subtle way.
Note that for any given system with any definite action S_p[X] and any
given initial state

$$\sum_{\text{final}} \int\!\!\int e^{i\{S_p[X(t)] - S_p[X'(t)]\}}\,\mathcal{D}X(t)\,\mathcal{D}X'(t) = 1 \tag{12.95}$$

This is because the integrals and the sum over final states are equivalent 
to 


$$\int K(X_f,t_f;X_a,t_a)\,K^*(X_f,t_f;X'_a,t_a)\,dX_f = \delta(X_a - X'_a) \tag{12.96}$$


by Eq. (4.37). Thus, if the initial wave function were ψ(X_a), we would
multiply by ψ(X_a)ψ*(X'_a) as we did in Eq. (12.79) and integrate to get

$$\int\!\!\int \delta(X_a - X'_a)\,\psi(X_a)\,\psi^*(X'_a)\,dX_a\,dX'_a = \int |\psi(X_a)|^2\,dX_a = 1 \tag{12.97}$$


Now notice that, if we put x'(t) = x(t) for all time in Eq. (12.90), we
have an expression just like Eq. (12.95) where the effective (and definite)
action is

$$S_p[X(t)] = S_0[X(t)] + S_{\text{int}}[x(t),X(t)]$$
with
$$S_p[X'(t)] = S_0[X'(t)] + S_{\text{int}}[x(t),X'(t)]$$

as required, as long as x'(t) = x(t). Hence F[x(t), x(t)] = 1.

The same argument limited to the time range t_c < t < t_f, using a
relation like Eq. (12.96) but with t_a, X_a replaced by t_c, X_c, shows that,
if x'(t) = x(t) for t > t_c, the dependence of F on x(t) for t > t_c drops
away, because the right side of Eq. (12.96) does not depend on x(t) for
t > t_c.

Rule III is an evident result of the fact that probabilities are deter- 
mined by adding the value of J over various circumstances. 

Rule IV is evident from Eq. (12.90) when it is realized that the
conditions of the rule imply that the action that goes into Eq. (12.90) is
$S_{0,A}[X_A(t)] + S_{\text{int},A}[x(t),X_A(t)] + S_{0,B}[X_B(t)] + S_{\text{int},B}[x(t),X_B(t)]$ and
that the exponential of the sum becomes a product, as does the integral
F, if the initial state is itself a product of wave functions.

Rule V is merely a statement of our results shown in Eqs. (12.82) 
and (12.85). 


These are some of the general properties of influence functionals. 
Calculations with them involve the various methods for doing path inte- 
grals applied to Eq. (12.89). We shall conclude this section by discussing 
certain important influence functionals. 
Just as gaussian probability distributions and gaussian noise distri-
butions are simple and important, so influence functionals which depend
on x(t), x'(t) as an exponential of a quadratic form, which we shall
call gaussian influence functionals, are particularly important.

First, if the environment is a set of harmonic oscillators in their 
ground states (or at a given temperature) coupled linearly to the sys- 
tem of interest x, evaluation of Eq. (12.90) shows that F is gaussian. 
But gaussian influence functionals, like gaussian probabilities, occur in 
good approximation in a much wider class of situations, namely, where 
the effect is the result of a very large number of influences, each of which 
by itself has little effect. For example, consider an atom in weak inter- 
action with each of the large number of atoms of an environmental gas. 
The influence of one atom A is very small, so its influence functional F_A
differs only slightly from 1. However, in view of rule IV, the complete F
is the product of many such factors, which becomes (nearly) the expo- 
nential of the sum of a small contribution from each. This contribution 
expanded to first and second order in the interaction with each atom 
leads to influence functionals of the gaussian type. 

As an application of this conclusion, a piece of metal placed in a cav- 
ity resonator affects the resonator in a simple linear way summarizable 
by one impedance function, even though the multitude of electrons in 
the metal behave in such a complex manner. The influence functional 
of the metal (X) on the cavity oscillator (x) is nearly a gaussian, and 
to this extent the metal is equivalent to some set of harmonic oscillators 
which would produce the same influence functional. 

The most general exponential functional involving x(t) and x'(t) in
linear form is

$$F[x(t),x'(t)] = \exp\Big\{i\int x(t)\,f(t)\,dt - i\int x'(t)\,g(t)\,dt\Big\} \tag{12.98}$$


for arbitrary and complex f(t) and g(t). If this is to be an influence func- 
tional, however, it must satisfy the conditions of our five rules. Rule I 
requires g(t) = f*(t), and rule II implies g(t) = f(t); hence g and f are 
equal and real. Thus the most general linear functional is that equivalent 
to the action of a classical external force in accordance with rule V. 

We need not discuss this simple case further; for it is completely ana-
lyzable just by adding −x(t) f(t) to the hamiltonian of the unperturbed
problem. If the exponent has both quadratic and linear terms, the linear 
term can be factored out, so via rule IV we can say it is a classical force 
plus the effect of a purely quadratic functional. 

The most general exponential functional which involves its argu-
ments purely quadratically is of the form

$$\begin{aligned} F[x(t),x'(t)] = \exp\Big\{-\int_0^T\!\!\int_0^t \big[ & \alpha(t,t')\,x(t)\,x(t') + \beta(t,t')\,x'(t)\,x'(t') \\ & + \gamma(t,t')\,x(t)\,x'(t') + \delta(t,t')\,x'(t)\,x(t')\big]\,dt'\,dt\Big\} \end{aligned} \tag{12.99}$$

for arbitrary and complex¹ α, β, γ, and δ. The integrals on t', t are over
the entire interesting range of time, but we always take t > t'. This is no
loss of generality, of course, but it is convenient for later analysis. For
this to be a satisfactory influence functional, we must have from Rule I

$$\beta(t,t') = \alpha^*(t,t') \tag{12.100}$$
and
$$\delta(t,t') = \gamma^*(t,t') \tag{12.101}$$


Rule II gives us a great deal of information, for putting x(t) = x'(t)
for t > t_c and, assuming t > t_c, t' < t_c, the expression (which is part of
the integral in Eq. (12.99))

$$\begin{aligned} \int_{t_c}^{T}\!\!\int_0^{t_c} \big[ & \alpha(t,t')\,x(t)\,x(t') + \beta(t,t')\,x(t)\,x'(t') \\ & + \gamma(t,t')\,x(t)\,x'(t') + \delta(t,t')\,x(t)\,x(t')\big]\,dt'\,dt \end{aligned} \tag{12.102}$$

must be independent of x(t) for t > t_c, for arbitrary x(t') and x'(t')
with t' < t_c. This requires that

$$\delta(t,t') = -\alpha(t,t'), \qquad \gamma(t,t') = -\beta(t,t') \tag{12.103}$$

as long as t > t_c, t' < t_c. But since t_c is arbitrary, Eqs. (12.103) must
hold for all t, t' (under the continuing restriction t > t').

Therefore, the most general gaussian influence functional depends on
only one complex function α(t,t') and is of the form

$$F[x(t),x'(t)] = \exp\Big\{-\int_0^T\!\!\int_0^t \big[\alpha(t,t')\,x(t') - \alpha^*(t,t')\,x'(t')\big]\,\big[x(t) - x'(t)\big]\,dt'\,dt\Big\} \tag{12.104}$$

¹These functions are defined only for t > t'.
In the case that α(t,t') is real, say A(t,t'), our functional is equiva-
lent to the exponential of Eq. (12.87), and we have the equivalent of a
noisy classical perturbation. In quantum-mechanical systems α is gen-
erally complex. A special case of importance is when α(t,t') = α(t − t')
depends only on the time difference t − t'. We are then dealing with
an environmental system which has average properties independent of
absolute time.
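
Rules I and II can be checked numerically on a time grid. The sketch
below discretizes the exponent of Eq. (12.104) as a double sum over grid
points with t' < t, uses an α of the oscillator form of Eq. (12.120) with
arbitrary parameter values, and tests the two rules on random paths.
The grid, the parameter values, and all function names are assumptions
made for illustration; this is not a method given in the text.

```python
import cmath
import random


def influence_exponent(alpha, x, xp, dt):
    """Discretized exponent of Eq. (12.104) on a uniform grid, with t' < t."""
    total = 0j
    for i in range(len(x)):        # i labels the later time t
        diff = x[i] - xp[i]
        for j in range(i):         # j labels the earlier time t'
            a = complex(alpha(i * dt, j * dt))
            total += (a * x[j] - a.conjugate() * xp[j]) * diff
    return -total * dt * dt


def influence_functional(alpha, x, xp, dt):
    return cmath.exp(influence_exponent(alpha, x, xp, dt))


def example_alpha(t, tp, coupling=0.8, omega0=1.5):
    # Same shape as Eq. (12.120); the numerical values are arbitrary.
    return (coupling ** 2 / (2.0 * omega0)) * cmath.exp(-1j * omega0 * (t - tp))


def random_path(n, rng):
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def check_rules(n=40, dt=0.1, cut=25, seed=0):
    """Return the violations of rule I, of F[x, x] = 1, and of rule II."""
    rng = random.Random(seed)
    x, xp = random_path(n, rng), random_path(n, rng)
    f_xy = influence_functional(example_alpha, x, xp, dt)
    f_yx = influence_functional(example_alpha, xp, x, dt)
    rule_one = abs(f_xy - f_yx.conjugate())
    normalization = abs(influence_functional(example_alpha, x, x, dt) - 1.0)
    # Rule II: let both paths share a common tail from index `cut` on,
    # then swap that tail for a different one and compare F.
    tail_a = random_path(n - cut, rng)
    tail_b = random_path(n - cut, rng)
    f_a = influence_functional(example_alpha, x[:cut] + tail_a, xp[:cut] + tail_a, dt)
    f_b = influence_functional(example_alpha, x[:cut] + tail_b, xp[:cut] + tail_b, dt)
    rule_two = abs(f_a - f_b)
    return rule_one, normalization, rule_two
```

All three returned numbers should vanish up to rounding. Rule II holds
exactly in this discretization because the factor x(t) − x'(t) is zero
on the common tail and earlier terms never reference later times.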

To help understand some of the properties of Eq. (12.104), we shall
ask for the probability that the x system makes a transition from energy
level n to some other orthogonal level m during an interval of time T
in the case that α is very small and we can use perturbation theory. If
we expand F in Eq. (12.104), the leading term, 1, gives nothing because
the states are orthogonal. The next term, linear in α, has four pieces.
One is $-\int_0^T\!\!\int_0^t \alpha(t,t')\,x(t)\,x(t')\,dt'\,dt$. When this is substituted for F in
Eq. (12.89), and the resulting J is used in Eq. (12.83) with ψ = φ_n and
χ = φ_m, the integral over paths x(t) and x'(t) is seen to be the product
of two factors. One, the integral over x(t), involves

$$\int e^{iS[x]}\Big[-\int_0^T\!\!\int_0^t \alpha(t,t')\,x(t)\,x(t')\,dt'\,dt\Big]\,\mathcal{D}x(t)$$

and when passed through Eq. (12.83) results in the transition element
(see Chap. 7)

$$\Big\langle m\Big|-\int_0^T\!\!\int_0^t \alpha(t,t')\,x(t)\,x(t')\,dt'\,dt\Big|n\Big\rangle = -\int_0^T\!\!\int_0^t \alpha(t,t')\,\langle m|x(t)x(t')|n\rangle\,dt'\,dt \tag{12.105}$$

The integral over x'(t) is just $\int e^{-iS[x']}\,\mathcal{D}x'(t)$ and results in the complex
conjugate of the transition element ⟨m|1|n⟩. Analyzing the other three
pieces in a similar way, the total transition probability is

$$\begin{aligned} P(n \to m) = \int_0^T\!\!\int_0^t \big[ & -\alpha(t,t')\,\langle m|x(t)x(t')|n\rangle\,\langle m|1|n\rangle^* \\ & + \alpha(t,t')\,\langle m|x(t)|n\rangle^*\,\langle m|x(t')|n\rangle \\ & + \alpha^*(t,t')\,\langle m|x(t)|n\rangle\,\langle m|x(t')|n\rangle^* \\ & - \alpha^*(t,t')\,\langle m|1|n\rangle\,\langle m|x(t)x(t')|n\rangle^* \big]\,dt'\,dt \end{aligned} \tag{12.106}$$

If m and n are orthogonal, ⟨m|1|n⟩ = 0. If S[x] comes from a constant
hamiltonian with energy level E_k for state k, then

$$\langle m|x(t)|n\rangle = x_{mn}\,e^{i(E_m - E_n)t} \tag{12.107}$$



Only the middle two terms of Eq. (12.106) survive, and they are complex
conjugates of each other, so that

$$P(n \to m) = 2\,|x_{mn}|^2\,\mathrm{Re}\int_0^T\!\!\int_0^t \alpha(t,t')\,e^{-i(E_m - E_n)(t - t')}\,dt'\,dt \tag{12.108}$$


Problem 12-3 For m = n, verify $P(m \to m) = 1 - \sum_{n \ne m} P(m \to n)$,
as required by conservation of probability.


In the case of a time-steady environment α(t,t') = α(t − t'). Suppose
we define the Fourier transform

$$\alpha(\omega) = \int_0^\infty \alpha(\tau)\,e^{-i\omega\tau}\,d\tau \tag{12.109}$$

(α(τ) is not defined for τ < 0.) Then since P(n → m) in Eq. (12.108)
is proportional to the time interval over which the integrals extend, we
can define a rate of transition per second and find the probability of
transition

$$P(n \to m)\ \text{per second} = 2\,|x_{mn}|^2\,\alpha_R(E_m - E_n) \tag{12.110}$$
where we have broken α(ω) into real and imaginary parts
$$\alpha(\omega) = \alpha_R(\omega) + i\,\alpha_I(\omega) \tag{12.111}$$

We may note that, for a disturbance by a classical force under gaus-
sian noise, α(τ) is real (see Eq. 12.87) and the real part of α(ω) is the
power-spectrum function of the noise as defined in Eq. (12.50). So, for
such classical noise systems

$$\alpha_R(\omega) = \alpha_R(-\omega) \tag{12.112}$$
and in first-order perturbation
Rate of transition n → m = Rate of transition m → n (12.113)


and both rates are proportional to the power P(w) at the frequency of 
the transition. Thus classical forces have equal probability of causing 
transitions up and down. 

Another interesting example is when the environment cannot supply 
energy with any reasonable probability. For example, it may be initially 
in the ground state or at zero temperature. We shall call such an envi-
ronment “cold.” For such a situation transitions of the system x going
up in energy (E_m > E_n) are unlikely. Hence for such cold environment
systems

$$\alpha_R(\omega) = 0 \quad \text{for } \omega > 0 \tag{12.114}$$
and for first-order perturbations
Rate of transition n → m = 0 if E_m > E_n (12.115)


Since any α(ω) can be written as the sum of one of the type shown
in Eq. (12.112) plus one of the type shown in Eq. (12.114), it is readily
apparent that any time-independent gaussian functional is equivalent to
a system in some cold environment acted on by a fluctuating classical
force described by a gaussian expression. This conclusion follows from
the fact that the product of any two gaussian functions is also a gaussian
and from rule IV. If the interaction of one environment on the system is
represented by A_1(t,t') in the manner of Eq. (12.87) and the interaction
of the other environment as A_2(t,t'), then the single interaction term in
the single resulting gaussian functional is A_1 + A_2.


12-9 INFLUENCE FUNCTIONAL FROM
A HARMONIC OSCILLATOR


We shall next give an example of how F can be worked out from
Eq. (12.90) for an environment consisting of a harmonic oscillator with
coordinates X, in the ground state, coupled to x linearly through an
interaction $S_{\text{int}}[x(t),X(t)] = C\int x(t)\,X(t)\,dt$. We suppose the oscillator
of unit mass and frequency ω_0, so that

$$S_0[X(t)] = \frac{1}{2}\int \big[\dot{X}^2(t) - \omega_0^2\,X^2(t)\big]\,dt \tag{12.116}$$
Then

$$\begin{aligned} F[x(t),x'(t)] = \sum_m \int\!\!\int & \exp\Big\{i\int \big[\tfrac{1}{2}\dot{X}^2(t) - \tfrac{1}{2}\omega_0^2 X^2(t) + C\,x(t)\,X(t)\big]\,dt\Big\} \\ & \times \exp\Big\{-i\int \big[\tfrac{1}{2}\dot{X}'^2(t) - \tfrac{1}{2}\omega_0^2 X'^2(t) + C\,x'(t)\,X'(t)\big]\,dt\Big\}\,\mathcal{D}X(t)\,\mathcal{D}X'(t) \end{aligned} \tag{12.117}$$

where m is the final state and the initial state is the ground state. The
integral over X is clearly gaussian, and in fact we have already done it;
for it is exactly the transition amplitude G_m0 worked out in Sec. 8-9 for
a forced harmonic oscillator. The forcing function there called f(t) is
here C x(t).¹ It is therefore given by Eq. (8.145) with n = 0 or

$$G_{m0} = \frac{(i\beta^*)^m}{\sqrt{m!}}\,G_{00} \tag{12.118}$$
with G_00 given in Eq. (8.138) and β* in Eq. (8.143) replacing f(t) by
C x(t). Likewise, the integral over X' is the complex conjugate of a simi-
lar expression but with f(t) replaced by C x'(t) this time. We distinguish
values of this substitution with a prime. Then the sum over final states
in Eq. (12.117) gives us

$$F[x(t),x'(t)] = \sum_m G'^*_{m0}\,G_{m0} = \sum_m \frac{(-i\beta')^m\,(i\beta^*)^m}{m!}\,G'^*_{00}\,G_{00} = G_{00}\,G'^*_{00}\,e^{\beta^*\beta'} \tag{12.119}$$
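
The last step of Eq. (12.119) is the exponential series in β*β'. The short
sketch below checks this numerically for arbitrary complex values,
leaving out the common factor formed by G_00 and the conjugate of G'_00.
The function names and the truncation at 60 terms are assumptions made
for illustration.

```python
import cmath
import math


def sum_over_final_states(beta_conj, beta_prime, terms=60):
    """Partial sum over m of (i beta*)^m (-i beta')^m / m!, as in Eq. (12.119)."""
    total = 0j
    for m in range(terms):
        total += (1j * beta_conj) ** m * (-1j * beta_prime) ** m / math.factorial(m)
    return total


def closed_form(beta_conj, beta_prime):
    """exp(beta* beta'), the right-hand side of Eq. (12.119) without the G factors."""
    return cmath.exp(beta_conj * beta_prime)
```

Here beta_conj stands for β* and beta_prime for β'; any complex values of
moderate size should give agreement to rounding accuracy.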





Substitution from Eqs. (8.138) and (8.143) produces, as expected, an F
of the form of Eq. (12.104) but with

$$\alpha(t,t') = \frac{C^2}{2\omega_0}\,e^{-i\omega_0(t - t')} \tag{12.120}$$

For example, the terms in xx' in Eq. (12.104) come from the β*β' in the
exponential; for this product by Eq. (8.143) is

$$\begin{aligned} \beta^*\beta' &= \frac{C^2}{2\omega_0}\int_0^T x(t)\,e^{i\omega_0 t}\,dt \int_0^T x'(t')\,e^{-i\omega_0 t'}\,dt' \\ &= \frac{C^2}{2\omega_0}\int_0^T\!\!\int_0^t \big[x(t)\,x'(t')\,e^{i\omega_0(t - t')} + x'(t)\,x(t')\,e^{-i\omega_0(t - t')}\big]\,dt'\,dt \end{aligned} \tag{12.121}$$

¹The reader may prefer to observe that Eq. (12.117) is

$$F[x(t),x'(t)] = \int\!\!\int\!\!\int K(X_f,t_f;X_a,t_a)\,K'^*(X_f,t_f;X'_a,t_a)\,\phi_0(X_a)\,\phi_0^*(X'_a)\,dX_a\,dX'_a\,dX_f$$

where K is the kernel of Eq. (3.66) for a forced harmonic oscillator with f(t) = C x(t)
and K' is that with f(t) = C x'(t). φ_0(X) is the wave function of the oscillator in
the ground state. All variables X_a, X'_a, and X_f then appear in a simple gaussian
way and may be directly integrated. We shall then find it as easy to do the finite-
temperature case. For here state n is the initial state with probability proportional
to e^{−βE_n} so, in view of rule III, the resulting F is obtained by the expression above
but with the wave functions φ_0(X_a)φ_0*(X'_a) replaced by

$$\text{const} \times \sum_n \phi_n(X_a)\,\phi_n^*(X'_a)\,e^{-\beta E_n}$$

that is, by the density matrix ρ(X_a, X'_a) worked out in Prob. 10-1. The integrations
are again gaussian.
The quantity α(ω) defined in Eq. (12.109) is therefore (see Eq. (5.17)
and the Appendix)

$$\alpha(\omega) = \frac{C^2}{2\omega_0}\int_0^\infty e^{-i\omega_0\tau}\,e^{-i\omega\tau}\,d\tau = \frac{C^2}{2\omega_0}\Big[-i\,\text{P.P.}\,\frac{1}{\omega_0 + \omega} + \pi\,\delta(\omega_0 + \omega)\Big] \tag{12.122}$$
so that the real part of α(ω) is
$$\alpha_R(\omega) = \frac{\pi C^2}{2\omega_0}\,\delta(\omega + \omega_0) \tag{12.123}$$

This is zero for positive ω. As expected, we have a “cold environment”
as specified in Eq. (12.114).

If many independent oscillators at different frequencies are all acting,
then by rule IV, their α_R(ω) functions add; so any cold system (to this
gaussian approximation) is equivalent to a continuum of oscillators in
their ground state. This follows, since any function α_R(ω), for negative
ω, can be built up of delta functions of the form of Eq. (12.123).
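
The one-way character of a cold environment can already be seen in the
first-order formula (12.108). The sketch below evaluates that double
integral by the midpoint rule for the single-oscillator α of Eq. (12.120)
and compares it with the same integral done by hand for this α. For a
single undamped oscillator the downward probability at exact resonance
grows as T squared, because the delta function in Eq. (12.123) is
infinitely sharp, while the upward probability stays bounded. The
parameter values, the grid, and the function names are assumptions made
for illustration.

```python
import cmath
import math


def alpha_oscillator(tau, coupling, omega0):
    """alpha(t - t') of Eq. (12.120) for a ground-state oscillator."""
    return (coupling ** 2 / (2.0 * omega0)) * cmath.exp(-1j * omega0 * tau)


def first_order_probability(omega, total_time, coupling=0.1, omega0=1.5,
                            x_mn=1.0, steps=300):
    """Midpoint-rule estimate of Eq. (12.108), with omega = E_m - E_n."""
    dt = total_time / steps
    total = 0j
    for i in range(steps):
        t = (i + 0.5) * dt
        dtp = t / steps                    # inner grid covers 0 < t' < t
        inner = 0j
        for j in range(steps):
            tau = t - (j + 0.5) * dtp      # tau = t - t'
            inner += alpha_oscillator(tau, coupling, omega0) * cmath.exp(-1j * omega * tau)
        total += inner * dtp * dt
    return 2.0 * abs(x_mn) ** 2 * total.real


def closed_form(omega, total_time, coupling=0.1, omega0=1.5, x_mn=1.0):
    """The same double integral evaluated analytically for this alpha."""
    c = coupling ** 2 / (2.0 * omega0)
    big_omega = omega + omega0
    if abs(big_omega) < 1e-12:
        return abs(x_mn) ** 2 * c * total_time ** 2
    return (2.0 * abs(x_mn) ** 2 * c
            * (1.0 - math.cos(big_omega * total_time)) / big_omega ** 2)
```

Calling first_order_probability(-1.5, 6.0) gives the downward transition
at resonance and first_order_probability(1.5, 6.0) the upward one.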

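Another interesting example is the interaction with an oscillator at
finite temperature. If the temperature is T, the initial state is energy
state n with relative probability e^{−E_n/kT}. For our case, the absolute
probability is

$$w_n = e^{-n\hbar\omega_0/kT}\,\big(1 - e^{-\hbar\omega_0/kT}\big) \tag{12.124}$$
If the initial state were n, the influence functional would be

$$F_n = \sum_m G'^*_{mn}\,G_{mn} \tag{12.125}$$

instead of the form in Eq. (12.119). Using rule III, we add these with
probabilities w_n, so our final F is

$$F = \sum_n \sum_m G'^*_{mn}\,G_{mn}\,e^{-n\hbar\omega_0/kT}\,\big(1 - e^{-\hbar\omega_0/kT}\big) \tag{12.126}$$
The sum is difficult to work out directly from Eq. (8.145), but it is

$$F = G_{00}\,G'^*_{00}\,e^{\beta^*\beta'}\,\exp\Big[-\frac{(\beta - \beta')(\beta^* - \beta'^*)}{e^{\hbar\omega_0/kT} - 1}\Big] \tag{12.127}$$

The α_R(ω) that results from this in place of Eq. (12.123) is

$$\alpha_R(\omega) = \frac{\pi C^2}{2\omega_0}\Big[\frac{e^{\hbar\omega_0/kT}}{e^{\hbar\omega_0/kT} - 1}\,\delta(\omega + \omega_0) + \frac{1}{e^{\hbar\omega_0/kT} - 1}\,\delta(\omega - \omega_0)\Big] \tag{12.128}$$

and sums of such expressions of many oscillators constitute the environ-
ment. Now transitions can go down in energy (ω < 0) or up in energy.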
We note that if ω > 0, the first delta function fails, whereas if ω < 0,
the second fails, and that indeed

$$\alpha_R(-|\omega|) = e^{\hbar|\omega|/kT}\,\alpha_R(+|\omega|) \tag{12.129}$$
This definite relation means that in perturbation theory, if E_n > E_m,

$$\frac{\text{probability per second of a transition up } (m \to n)}{\text{probability per second of a transition down } (n \to m)} = e^{-(E_n - E_m)/kT} \tag{12.130}$$

by using Eq. (12.110).

Thus, if the system x occupies states n with relative probabilities
e^{−E_n/kT}, the net number of up and down transitions will balance out and
the system will be in statistical equilibrium for weak perturbation with
the environment. This is just what we expect for the laws of statistical
equilibrium. Any environment at temperature T producing a quadratic
influence functional will have the property of Eq. (12.129).
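
The balance described above can be checked with a few lines of
arithmetic. The sketch below takes the two delta-function weights of
Eq. (12.128), in units of πC²/2ω_0, and confirms the ratio of
Eq. (12.129) as well as the vanishing net flow between two Boltzmann-
populated levels separated by ħω_0. The function names and the
dimensionless variable x = ħω_0/kT are notation introduced here for
illustration only.

```python
import math


def thermal_weights(x):
    """Weights of delta(w + w0) and delta(w - w0) in Eq. (12.128).

    x stands for hbar*w0/(k*T); weights are in units of pi*C^2/(2*w0).
    """
    down = math.exp(x) / math.expm1(x)   # multiplies delta(w + w0): transitions down
    up = 1.0 / math.expm1(x)             # multiplies delta(w - w0): transitions up
    return down, up


def net_flow(x):
    """Up-flow minus down-flow between two levels populated as e^{-E/kT}."""
    down, up = thermal_weights(x)
    p_lower, p_upper = 1.0, math.exp(-x)  # relative Boltzmann populations
    return p_lower * up - p_upper * down
```

The difference down − up equals 1 for every temperature, which is the
cold-environment part that is separated out in the splitting of the
thermal α_R into a cold piece plus a symmetric noisy piece.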

For an atom as system x in interaction with the electromagnetic field
at temperature T as the environment, α_R(ω) is given by an expression
like Eq. (12.128) integrated over all the modes of the field of various
frequencies ω_0. It can be split into the cold environment of Eq. (12.123)
plus a noisy external force:

$$\alpha_R(\omega) = \frac{\pi C^2}{2\omega_0}\,\delta(\omega_0 + \omega) + \frac{\pi C^2}{2\omega_0}\,\frac{1}{e^{\hbar\omega_0/kT} - 1}\,\big[\delta(\omega_0 + \omega) + \delta(\omega_0 - \omega)\big] \tag{12.131}$$


The first term produces only transitions down in energy and is called
spontaneous emission. The second produces transitions up and down
with equal ease and is called induced emission or induced absorption.
We say that the transition is induced by an external force or noise
whose mean-square strength at frequency ω varies with temperature as
1/(e^{ħω/kT} − 1). This is the way Einstein first discussed the blackbody-
radiation laws. As we see here, any environment giving a quadratic
influence functional at temperature T (we say it is an environment re-
sponding linearly) can be treated in the same way. Many people have
extended Einstein’s argument to other systems, like the voltage fluctu-
ation noise in a resistor at temperature T. The first term measures the
rate at which energy is taken out of our system x in a one-way manner.
It measures the amount of “dissipation” produced by the environment
(e.g., electrical resistance of a metal or radiation resistance of the electro-
magnetic field). At temperature T we can then say that things behave
as if, in addition to the dissipation, there is a noisy signal generated by
the environment whose mean square at each frequency is proportional
to the dissipation at that frequency and to 1/(e^{ħω/kT} − 1). This is called
the dissipation-fluctuation theorem.

We cannot pursue this subject further here.¹


12-10 CONCLUSIONS


In these applications of path integrals to probability theory it is evident 
that, if the integrands are gaussian, we can make considerable use of 
the technique. But these problems are precisely those for which other 
methods, not requiring path integrals, are also available to solve the 
problem. One may reasonably question the real utility of the path inte- 
grals. We can only say that if the problem is not gaussian, it can at least 
be formulated and studied by using path integrals — and that we might 
hope that someday, when the techniques of analysis improve, more can 
be done with it. The only example of a result obtained with path inte- 
grals which cannot be obtained in simple manner by more conventional 
methods is the variational principle discussed in Chap. 11. We hope 
that further study of these methods may yield more such results. 

In the meantime, however, it is worth pointing out that the path 
integral method does permit a rapid passage from one formulation of 
a problem to another and often gives a clear or quick suggestion of 
a relation which can then be more slowly derived in a more ordinary 
fashion. 

With regard to application to quantum mechanics, path integrals 
suffer most grievously from a serious defect. They do not permit a dis- 
cussion of spin operators or other such operators in a simple and lucid 
way. They find their greatest use in systems for which coordinates and 
their conjugate momenta are adequate. Nevertheless, spin is a simple 
and vital part of real quantum-mechanical systems. It is a serious limita- 
tion that the half-integral spin of the electron does not find a simple and 
ready representation. It can be handled if the amplitudes and quantities 
are considered as quaternions instead of ordinary complex numbers, but 
the lack of commutativity of such numbers is a serious complication. 


¹The subject of influence functionals is discussed in detail by R.P. Feynman and
F.L. Vernon, Jr., The Theory of a General Quantum System Interacting with a
Linear Dissipative System, Ann. Phys. (N.Y.), vol. 24, pp. 118-173, 1963, and by
W.H. Wells, Quantum Formalism Adapted to Radiation in a Coherent Field, Ann.
Phys. (N.Y.), vol. 12, pp. 1-40, 1961. An application to calculation of mobility of
the polaron is in R.P. Feynman, R.W. Hellwarth, C.K. Iddings, and P.M. Platzman,
Mobility of Slow Electrons in a Polar Crystal, Phys. Rev., vol. 127, pp. 1004-1017,
1962.
Nevertheless, many of the results and formulations of path integrals
can be reexpressed by another mathematical system, a kind of ordered
operator calculus.¹ In this form many of the results of the preceding
chapters find an analogous but more general representation (only for 
the special problems of Chap. 11 is the generalization not known) in- 
volving noncommuting variables. For example, in this chapter discussing 
influence functionals it must have struck the reader that an environment 
coupled not to the coordinate x but to a noncommuting operator, such 
as the spin, would be an important and interesting generalization. Such 
things cannot be conveniently expressed in the path integral formulation 
but can be easily expressed in the closely related operator calculus. 

An effort to extend the path integral approach beyond its present 
limits continues to be a worthwhile pursuit; for the greatest value of 
this technique remains in spite of its limitations, i.e., the assistance 
which it gives one’s intuition in bringing together physical insight and 
mathematical analysis. 


¹R.P. Feynman, An Operator Calculus Having Applications in Quantum Electro-
dynamics, Phys. Rev., vol. 84, pp. 108-128, 1951.


Appendix 


Some Useful Definite Integrals 


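$$\int_{-\infty}^{\infty} e^{ax^2 + bx}\,dx = \sqrt{\frac{\pi}{-a}}\;e^{-b^2/4a} \tag{A.1}$$

for Re{a} ≤ 0 but a ≠ 0


$$\int_{-\infty}^{\infty} e^{a(x_1 - x)^2}\,e^{b(x_2 - x)^2}\,dx = \sqrt{\frac{-\pi}{a + b}}\;\exp\Big[\frac{ab}{a + b}\,(x_1 - x_2)^2\Big] \tag{A.2}$$

for Re{a+b} ≤ 0 but a+b ≠ 0


$$\int_0^\infty \exp\Big(\frac{ia}{x^2} + ibx^2\Big)\,dx = \sqrt{\frac{i\pi}{4b}}\;\exp\{i2\sqrt{ab}\} \tag{A.3}$$

for a,b real and positive


$$\int_0^T \exp\Big(\frac{ia}{T - \tau} + \frac{ib}{\tau}\Big)\,\frac{d\tau}{\sqrt{(T - \tau)\,\tau^3}} = \sqrt{\frac{i\pi}{bT}}\;\exp\Big[\frac{i}{T}\big(\sqrt{a} + \sqrt{b}\big)^2\Big] \tag{A.4}$$

for a,b real and positive


$$\int_0^T \exp\Big(\frac{ia}{T - \tau} + \frac{ib}{\tau}\Big)\,\frac{d\tau}{[(T - \tau)\,\tau]^{3/2}} = \sqrt{\frac{i\pi}{T^3}}\;\frac{\sqrt{a} + \sqrt{b}}{\sqrt{ab}}\;\exp\Big[\frac{i}{T}\big(\sqrt{a} + \sqrt{b}\big)^2\Big] \tag{A.5}$$

for a,b real and positive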


n/2 . 
| eten ain 2g dt = al(a — 1)e? +1] (A.6) 
0 
a


[ eP 8 sin(psin x) sin(ax) dx 
0 Tp 
Qa! 





S eP 5" cos(p sin x) cos(ax) dx 
0 


0 om m 


for k >—1, A>0, m>0O 


7 


e™t dt = 275(w) 


et dt = nô — rô P.P. ae li 
l were, Nee (<) Oana 





ro 
ft% m3 ~ Vol T f(k [see Sec. 4-3] 


ty pt ta rtp 
J f(t, 8) dsdt= | f(s,t)dsdt 
ta Jta ta Jt 





(A.7) 


(A.8) 


(A.9) 


(A.10) 


(A.11) 


(A.12) 
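
Entries (A.1) and (A.2) can be spot-checked numerically in the simplest
case of real negative a and b, where the integrands decay and ordinary
quadrature applies. The sketch below does this with a midpoint rule; the
integration window, step count, and function names are assumptions made
for illustration, and the check says nothing about the complex-parameter
cases used elsewhere in the book.

```python
import math


def midpoint_integral(func, lo, hi, steps=40000):
    """Plain midpoint rule on [lo, hi]."""
    dx = (hi - lo) / steps
    return sum(func(lo + (j + 0.5) * dx) for j in range(steps)) * dx


def a1_numeric(a, b, half_width=30.0):
    return midpoint_integral(lambda x: math.exp(a * x * x + b * x),
                             -half_width, half_width)


def a1_closed(a, b):
    """Right-hand side of (A.1) for real a < 0."""
    return math.sqrt(math.pi / -a) * math.exp(-b * b / (4.0 * a))


def a2_numeric(a, b, x1, x2, half_width=30.0):
    return midpoint_integral(
        lambda x: math.exp(a * (x1 - x) ** 2 + b * (x2 - x) ** 2),
        -half_width, half_width)


def a2_closed(a, b, x1, x2):
    """Right-hand side of (A.2) for real a + b < 0."""
    return math.sqrt(-math.pi / (a + b)) * math.exp(a * b / (a + b) * (x1 - x2) ** 2)
```

For example, a1_numeric(-1.3, 0.4) and a1_closed(-1.3, 0.4) should agree
closely, as should the two (A.2) functions for a = -0.8, b = -0.5.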


Appendix 


Notes 


These notes were added by the editor to explicate, amplify, or update 
the book’s discussion. A relevant note is signaled in the text through 
the symbol’. 


Throughout: The book is often careless in distinguishing between 
“probability,” “relative (i.e. unnormalized) probability,” and “probabil- 
ity density”. Similarly for amplitude. 


Page 3: In this book a sequence of two events is labeled as a (ini- 
tial) to b (final); a sequence of three events is labeled as a to c to b; a 
sequence of four events is labeled as a to d to c to b; and so forth. This 
scheme for inserting intermediate events proves its value many times. 
(See particularly pages 21 and 126.) 


Page 3: “This particular experiment has never been done in just 
this way.” This statement was true at the date of publication (1965). 
The remarkable experimental progress since that date can be glimpsed 
through the following publications: 


Claus Jönsson, “Elektroneninterferenzen an mehreren künstlich her-
gestellten Feinspalten,” Zeitschrift für Physik 161 (1961) 454-
474. Translated as “Electron diffraction at multiple slits,” Amer-
ican Journal of Physics 42 (1974) 3-11. (Wave-like properties of
electrons.)

A. Tonomura, J. Endo, T. Matsuda, T. Kawasaki, and H. Ezawa, 
“Demonstration of single-electron buildup of an interference pat- 
tern,” American Journal of Physics 57 (1989) 117-120. (Simul- 
taneous wave-like and particle-like properties of electrons.) 

Movies of the above experiments are at 
<http://www.hqrd.hitachi.co.jp/em/doubleslit.cfm>. 

R. Gähler and A. Zeilinger, “Wave-optical experiments with very
cold neutrons,” American Journal of Physics 59 (1991) 316-324. 
(Wave-like properties of neutrons.) 




Olaf Nairz, Markus Arndt, and Anton Zeilinger, “Quantum inter- 
ference experiments with large molecules,” American Journal of 
Physics 71 (2003) 319-325. (Wave-like properties of C60.)

Michael S. Chapman, David E. Pritchard, et al., “Photon scatter- 
ing from atoms in an atom interferometer: Coherence lost and 
regained,” Physical Review Letters 75 (1995) 3783-3787. (Ob- 
serving atoms as they pass through the double slits, as discussed 
on page 7 of this book.) 

E. Buks, R. Schuster, M. Heiblum, D. Mahalu, and V. Umansky, 
“Dephasing in electron interference by a ‘which-path’ detector,” 
Nature 391 (1998) 871-874. (More on observing electrons, as or 
after they pass through the double slits.) 

Paul R. Berman, editor, Atom Interferometry (Academic Press, San 
Diego, 1997). 

Helmut Rauch and Samuel A. Werner, Neutron Interferometry: Les- 
sons in Experimental Quantum Mechanics (Oxford University 
Press, New York, 2000). 


Page 21: In the generalization to time, it helps to think of the holes 
in Fig. 1-9 as being covered by shutters that open only during specific 
time intervals. Then a path is specified by a prescription like “through 
hole Es at time slice t17, then through hole Dg at time slice t29,” etc. In 
the limit that the screens are drilled away to nothingness, the shutters 
are always open. 


Page 22: Feynman’s hunch was wrong: in fact other consistent 
interpretations are possible. One such alternative is the de Broglie- 
Bohm formulation, described in 


David Bohm and B.J. Hiley, The Undivided Universe: An Ontological 
Interpretation of Quantum Theory (Routledge, London, 1993). 


Page 23: The desired “statistical mechanics of [the] amplifying ap-
paratus” is being worked out under the name of decoherence. The vast
technical literature of this field is best approached through 


W.H. Zurek, “Decoherence, einselection, and the quantum origins of 
the classical,” Reviews of Modern Physics 75 (2003) 715-775. 


Page 26: The entity called the kernel here is more often called the
“propagator” or the “Green’s function.” See, for example, 


L.S. Schulman, Techniques and Applications of Path Integration (Wi- 
ley, New York, 1981). 

Hagen Kleinert, Path Integrals in Quantum Mechanics, Statistics, 
Polymer Physics, and Financial Markets, third edition (World 
Scientific, River Edge, New Jersey, 2004). 


Page 28, problem 2-2: Hint: This problem can be solved directly, 
but it is easier if you first integrate by parts to prove the theorem that, 
for a harmonic oscillator, 


$$S_{cl} = \frac{m}{2}\Big[\,x(t)\,\dot{x}(t)\,\Big]_{t_a}^{t_b}.$$


Page 28, answer to problem 2-3: If $T = t_b - t_a$, then


$$S_{cl} = \frac{m(x_b - x_a)^2}{2T} + \frac{fT(x_b + x_a)}{2} - \frac{f^2T^3}{24m}.$$
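
A closed-form action of this kind, namely m(x_b - x_a)^2/(2T) + fT(x_a + x_b)/2 - f^2 T^3/(24m), can be checked by integrating the Lagrangian along the classical path. The sketch below is illustrative only and rests on a stated assumption: that problem 2-3 concerns a particle under a constant force f, with Lagrangian L = m v^2/2 + f x, so that the classical path is a parabola. The function names and the sample parameter values are hypothetical.

```python
def classical_action_numeric(m, f, xa, xb, T, n=200):
    """Integrate L = m v^2 / 2 + f x (assumed Lagrangian) along the classical
    path from (xa, 0) to (xb, T) using composite Simpson's rule."""
    # Initial velocity chosen so the parabola x(t) reaches xb at time T.
    v0 = (xb - xa) / T - f * T / (2 * m)

    def lagrangian(t):
        x = xa + v0 * t + f * t * t / (2 * m)
        v = v0 + f * t / m
        return 0.5 * m * v * v + f * x

    h = T / n
    total = lagrangian(0.0) + lagrangian(T)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * lagrangian(i * h)
    return total * h / 3

def classical_action_closed(m, f, xa, xb, T):
    """Closed form: m (xb-xa)^2 / (2T) + f T (xa+xb) / 2 - f^2 T^3 / (24 m)."""
    return (m * (xb - xa) ** 2 / (2 * T)
            + f * T * (xa + xb) / 2
            - f ** 2 * T ** 3 / (24 * m))
```

Because the integrand is quadratic in t, Simpson's rule is exact up to rounding, so the two functions should agree to machine precision for any choice of m, f, x_a, x_b, and T > 0.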





Page 33, equation (2.22): (1) This kernel has dimensions 1/[length].
More generally, in s-dimensional configuration space the kernel has di-
mensions 1/[length]^s. (2) The factor A is a complex quantity with phase
$\pi/4$ and the dimensions of length. (3) In contrast to the situation for the
Riemann sum, the path integral normalizing factor $A^{-N}$ goes to infinity
as $\epsilon \to 0$ and the subset of paths becomes more representative. (4) We
don’t really sum over all paths, but over all paths moving forward in
time.


Page 47: Sections 3-2 and 3-3 can be skipped on a first reading. 


Page 56, equation (3.40): This probability density is unnormal- 
ized. (As reflected by the fact that it has the wrong dimensions!) The 
normalized version, which is used in Fig. 3-6, is 


P(e’) = Fe (Cus) = Clu)? + Su) = $(u)P). 


Page 63, equation (3.60): This expression is correct in magni-
tude, but the phase (i.e. the branch of $i^{1/2}$) is ambiguous. The correct
expression (see for example N.S. Thornber and E.F. Taylor, “Propaga-
tor for the simple harmonic oscillator,” American Journal of Physics 66
(1998) 1022-1024) is


$$F(T) = \left(\frac{m\omega}{2\pi\hbar\,|\sin\omega T|}\right)^{1/2} e^{-i\theta} \qquad \text{where} \qquad \theta = \frac{\pi}{4}\left[\,1 + 2\,\mathrm{trunc}(\omega T/\pi)\,\right].$$


Here “trunc” denotes the “truncation” function: trunc(x) is the largest
integer less than or equal to x.
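
The effect of the branch choice can be seen in a few lines of code. The sketch below is illustrative only: it evaluates a prefactor of magnitude (mω / 2πħ|sin ωT|)^{1/2} with phase θ = (π/4)[1 + 2 trunc(ωT/π)], and compares it with a naive principal-branch square root of mω / (2πiħ sin ωT). The function names, the default ħ = 1, and the use of `math.floor` for "trunc" (matching the definition "largest integer less than or equal to") are assumptions made for the example.

```python
import cmath
import math

def theta(omega, T):
    """Phase (pi/4) * [1 + 2 * floor(omega * T / pi)]."""
    return (math.pi / 4) * (1 + 2 * math.floor(omega * T / math.pi))

def prefactor(m, omega, T, hbar=1.0):
    """Magnitude sqrt(m omega / (2 pi hbar |sin(omega T)|)) times exp(-i theta)."""
    magnitude = math.sqrt(m * omega / (2 * math.pi * hbar * abs(math.sin(omega * T))))
    return magnitude * cmath.exp(-1j * theta(omega, T))

def naive_prefactor(m, omega, T, hbar=1.0):
    """Principal-branch square root of m omega / (2 pi i hbar sin(omega T))."""
    return cmath.sqrt(m * omega / (2j * math.pi * hbar * math.sin(omega * T)))
```

For 0 < ωT < π the two functions agree. For π < ωT < 2π they differ by an overall sign, which is one concrete instance of the branch ambiguity the note describes.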


Page 64, problem 3-10: Hint: First prove that, for this system, 


Page 98: The argument leading from the probability density (5.4)
to the wave function (5.5) is suggestive, not definitive. Any argument
of this sort cannot uncover the phase of the wave function. This phase
might be a physically insignificant constant $e^{i\delta}$, in which case ignoring it
would be perfectly permissible. But the phase factor might be a function
of momentum $e^{i\delta(p)}$, which does not change the probability density for
momentum, P(p), but which can dramatically change the probability
density for position, P(x).
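
A small numerical experiment makes this point concrete. The sketch below is illustrative and not taken from the book: it takes a Gaussian momentum amplitude exp(-p^2/2), multiplies it by a hypothetical momentum-dependent phase exp(2i p^2), and compares the resulting position densities. Units with ħ = 1, the grid sizes, the phase coefficient 2, and all names are assumptions chosen for the example. The Fourier integral is approximated by a plain Riemann sum.

```python
import cmath
import math

def position_density(phi, p_grid, x_grid):
    """Return |psi(x)|^2 on x_grid, where psi(x) is the integral over p of
    phi(p) * exp(i p x) / sqrt(2 pi), approximated by a Riemann sum (hbar = 1)."""
    dp = p_grid[1] - p_grid[0]
    density = []
    for x in x_grid:
        psi = sum(a * cmath.exp(1j * p * x) for a, p in zip(phi, p_grid))
        psi *= dp / math.sqrt(2 * math.pi)
        density.append(abs(psi) ** 2)
    return density

def variance(density, grid):
    """Variance of the grid variable under an unnormalized density."""
    norm = sum(density)
    mean = sum(d * g for d, g in zip(density, grid)) / norm
    return sum(d * (g - mean) ** 2 for d, g in zip(density, grid)) / norm

p_grid = [-6.0 + 0.05 * k for k in range(241)]
x_grid = [-20.0 + 0.1 * k for k in range(401)]

# Same modulus in momentum space; only the phase differs.
phi_plain = [math.exp(-p * p / 2) for p in p_grid]
phi_phased = [a * cmath.exp(2j * p * p) for a, p in zip(phi_plain, p_grid)]

momentum_density_plain = [abs(a) ** 2 for a in phi_plain]
momentum_density_phased = [abs(a) ** 2 for a in phi_phased]

var_x_plain = variance(position_density(phi_plain, p_grid, x_grid), x_grid)
var_x_phased = variance(position_density(phi_phased, p_grid, x_grid), x_grid)
```

The two momentum densities are identical, while the position variance grows from about 1/2 to about 17/2, the values obtained by carrying out the Gaussian Fourier integrals analytically. A probability density for momentum alone therefore cannot fix the position density.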


Page 107: The argument leading from the probability (5.23) to the 
amplitude (5.25) has the same defect remarked upon in the previous 
paragraph. 


Page 130: We distinguish between the arbitrary field point r and the 
location of the particle x within that field. In chapter 6 this distinction 
is largely pedantic, but in chapter 9 (Quantum electrodynamics) it is 
essential. 


Page 142, equation (6.62): This result holds in the limit of long 
| 2 R? 
times è —. 0. Hint: Use the substitution z? = ——— be — 
Qhtp 2h(ty — te) 
equation (A.3). 
Page 204: This kernel has dimensions 1/|,/mass x length]”. 


Page 217, problem 8-4: Hint: First show that (for N odd) 


, and then 


(N-1)/2 . 
Y> (003)? — 2 (28)? + (@8)? - #228)? 


a= 


bo] eR 


1. 
L= 7 (Q6) + 


Page 217, problem 8-5: It is also worth showing that


CNCE CNEA A for n odd 
(Bo |1| Bo) (Sollo) (n — 1)!(A/2wa)?/? for n even 








Page 218: Explicitly, in sums over k (as in equations 8.89 and 8.93), 


N 


: Nai 
XT means > I (Fa) 
a=0 


k=1 


Page 233, equation (8.138): This equation is often written in the 
form (obtained through the use of equation A.12) 


L eye | 
Goo = exp l-a | | Parser ds a 


Page 236: Caution! Do not attempt this chapter without first 
reading chapter 8 (Harmonic oscillators) and working problems 8-3, 8-4, 
and 8-5. Do this even if you think you aren’t interested in harmonic 
oscillators, and even if you think you already know all about them. 


Page 247: The normalization, spelled out in more detail, is 
// D6 (Gy k Fa k)i, kål, kol ks Ai 1] dy w Az Kr 


Page 252: For polarization 1, this manipulation results in (using 
t= te, s = tg, so that we have our usual sequence of a to d to c to b) 


1 4 me J” z kel ) = = 
sat, 1S ees —4 TC te—t 
AML = A 2 a A dte J. dta e f A J dto J dzd 


x< o` (t/h) Em (tote) yyy (%e)Fr (Xe, teo) UN (xe) 


x e C/M En Beta) tN (Xa) Ji (Kas ta) Yr (xa)e H EL amta) 


Page 309, equation (11.41): The numerator is the kernel from £a 
to Ze at “time” ù, times f(£e), times the kernel from ze to Zp at “time” 
G, integrated over all possible values of ze: 


J k boD te TeK CoU Tao Oates: 


— OO 


Then use expression (10.32) for the kernel. 